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FOREWORD 


The  Software  Engineering  Laboratory  (SEL)  is  an  organization  sponsored  by  the  National 
Aeronautics  and  Space  Ad  ministration/ Goddard  Space  Flight  Center  (NASA/GSFC)  and 
created  for  the  purpose  of  investigating  the  effectiveness  of  software  engineering  technologies 
when  applied  to  the  development  of  applications  software.  The  SEL  was  created  in  1977  and 
has  three  primary  organizational  members: 

NASA/GSFC,  Systems  Development  Branch 

The  University  of  Maryland,  Computer  Science  Department 

Computer  Sciences  Corporation.  Systems  Development  Operation 

The  goals  of  the  SEL  are  (1)  to  understand  the  software  development  process  in  the  GSFC 
environment:  (2)  to  measure  the  effect  of  various  methodologies,  tools,  and  models  on  this 
process:  and  (3)  to  identify  and  then  to  apply  successful  development  practices.  The  activities, 
findings,  and  recommendations  of  the  SEL  are  recorded  in  the  Software  Engineering  Labora- 
tory Series,  a continuing  series  of  reports  that  includes  this  document. 

Single  copies  of  this  document  can  be  obtained  by  writing  to 

Systems  Development  Branch 
Code  552 

Goddard  Space  Flight  Center 
Greenbelt,  Maryland  20771 
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FIFTEENTH  ANNUAL  SOFTWARE  ENGINEERING  WORKSHOP 
NASA/GODDARD  SPACE  FLIGHT  CENTER 
BUILDING  8 AUDITORIUM 
NOVEMBER  28-29, 1990 


Summary  of  Presentations 

D.  K.  Cover  and  E.  J.  Smith  (Computer  Sciences  Corporation) 

Session  1 

Topic  The  SEL  at  Age  15 

Toward  a Mature  Measurement  Environment 

V.  R.  Basili  (University  of  Maryland) 

Impacts  of  a Process  Improvement  Program  in  a Production  Environment 
G.  T Page  (Computer  Sciences  Corporation) 

Results  of  15  Years  of  Measurement  in  the  SEL 

F.  E.  McGany  (NASA/Goddard  Space  Flight  Center) 

Session  2 

Topic  Process  Improvement 

A Framework  for  Assessing  the  Adequacy  and  Effectiveness  of  Software  Development  Methodolo- 
gies 

J.  D.  Arthur  and  R.  E.  Nance  (Virginia  Polytechnic  Institute) 

A Method  for  Tailoring  the  Information  Content  of  a Software  Process  Model 
M.  B.  Arends  (McDonnell  Douglas) 

S.  Perkins  (University  of  Houston) 

Software  Technology  Insertion : A Study  of  Success  Factors 
R.  Lydon  (Raytheon) 

Session  3 

Topic  Measurement 

Pragmatic  Quality  Metrics  for  Evolutionary  Software  Development  Models 

W.  Royce  (TRW) 

RWP  Project  Ada  Development  Metrics  and  Observations 
R.  E.  Loesh  (NASA/Jet  Propulsion  Laboratory) 
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Early  Experiences  Building  a Software  Quality  Prediction  Model 

W.  W.  Agresti,  W.  M.  Evanco,  and  M.  C.  Smim  (The  MITRE  Corporation) 

Session  4 
Topic:  Reuse 

Bias  and  Design  in  Software  Specifications 

P.  A.  Straub  and  M.  V.  Zelkcrwitz  (University  of  Maryland) 

SEL  Ada  Reuse  Analysis  and  Representation 
R.  Kester  (Computer  Sciences  Corporation) 

Reuse  Metrics  -d  Measurements:  A Framework 
D.  Reifer  (Reiter  Consultants,  Incorporated) 

Session  5 

Topic:  Process  Assessment 

Cost  and  Quality  Planning  for  Large  NASA  Programs 

K.  Y.  Rone  (IBM) 

Effect  of  Formal  Specifications  on  Program  Complexity  and  Reliability:  An  Experimental  Study 
A.  L Goel  and  S.  N.  Sahoo  (Syracuse  University) 

An  Analysis  of  Defect  Densities  Found  During  Software  Inspections 
J.  C.  Kelly  and  J.  Hops  (NASA/Jet  Propulsion  Laboratory) 

J.  S.  Sherif  (California  State  University) 

Panel  1 

Topic:  Experiences  in  Implementing  an  Effective  Measurement  Program 

Michael  Daskalantonakis  (Motorola) 

Bob  Grady  (Hewlett-Packard) 

Ray  Wolverton  (Hughes  Aircraft  Company) 

Mitsuru  Ohba  (IBM/Japan) 


Panel  2 

Topic:  Software  Engineering  in  the  1980s:  Most  Significant  Achievements/Greatest 

Disappointments 

Barry  Boehm  (Defense  Advanced  Research  Projects  Agency,  Information  Sciences  Technol- 
ogy Office) 

Larry  Druffel  (Software  Engineering  Institute) 

Manny  Lehman  (Imperial  College) 

Harlan  Mills  (Software  Engineering  Technology,  Inc) 

Vic  Basili  (University  of  Maryland) 
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SUMMARY  OF  PRESENTATIONS  AND  PANELS 


Donna  Cover 
Elizabeth  Smith 


COMPUTER  SCIENCES  CORPORATION 


SUMMARY  OF  THE  FIFTEENTH  ANNUAL  SOFTWARE 
ENGINEERING  WORKSHOP 


On  November  28  and  29, 1990,  approximately  500  attendees  gathered  in  Building  8 at  the  National 
Aeronautics  and  Space  Administration  (NASAyGoddard  Space  Flight  Center  (GSFC)  for  the 
Fifteenth  Annual  Software  Engineering  Workshop.  The  meeting  is  held  each  year  as  a forum  for 
information  exchange  in  the  measurement,  utilization,  and  evaluation  of  software  methods,  models, 
and  tools.  It  is  sponsored  by  the  Software  Engineering  Laboratory  (SEL),  a cooperative  effort  of 
NASA/GSFC,  Computer  Sciences  Corporation,  and  the  University  of  Maryland.  Among  the  audi- 
ence were  representatives  from  approximately  10  universities,  30  government  agencies,  9 NASA 
centers,  and  100  private  corporations  and  institutions.  Fifteen  papers  were  presented  in  five  ses- 
sions: 

• The  SEL  at  Age  15 

• Process  Improvement 

• Measurement 

• Reuse 

• Process  Assessment 

The  sessions  were  followed  by  two  panel  discussions: 

• Experiences  in  Implementing  an  Effective  Measurement  Program 

o Software  Engineering  in  the  1980s:  Most  Significant  Achievements/Greatest  Disappoint- 
ments 

A summary*  of  the  presentations  and  panel  discussions  is  given  on  the  following  pages. 
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SESSION  1— THE  SEL  AT.  AGE  IS 


Frank  McGarry  of  GSFC  introduced  the  workshop,  welcomed  attendees,  and  gave  a brief  overview 
of  the  SEL  He  noted  that  on  this  15  th  anniversary  of  the  SEL,  it  is  appropriate  to  look  back  on  what 
has  been  learned:  How  has  the  model  of  software  process  improvement  evolved?  What  impact 
have  the  SEL  experiments  had  on  the  production  environment?  What  has  the  government  learned 
about  software  from  these  experiences  over  15  years? 

The  three  speakers  for  this  session  were  Victor  Basili  of  the  University  of  Maryland,  Gerald  Page  of 
Computer  Sciences  Corporation,  and  Frank  McGany  of  the  Goddard  Space  Flight  Center* 

Toward  a Mature  Measurement  Environment 

victor  Basili 

Basili  discussed  the  SEL  from  a research  perspective,  focusing  on  software  measurement  maturity. 
He  indicated  that  a problem  in  software  engineering  is  that  most  of  the  research  being  done  is 
“bottom-up/  that  is,  researchers  are  packaging  pieces  of  technology  that  can’t  be  put  together  very 
easily.  The  software  engineering  community  needs  to  create  a “top-down,”  experimental,  evolu- 
tionary framework  that  can  be  focused,  logically  and  physically  integrated  to  produce  quality 
software  productivity,  and  evaluated  and  tailored  for  the  particular  application  environment.  In 
short,  what  is  needed,  said  Basili,  are  more  experimental  laboratories  conducting  SEL-type  activi- 
ties. 

From  a researcher’s  point  of  view,  the  SEL  is  a laboratory  that  helps  the  researcher  to  understand 
the  various  software  processes  and  products.  With  this  understanding,  the  researcher  builds  de- 
scriptive models  of  the  processes  and  products  that  promote  even  greater  understanding  and 
deeper  analysis.  Basili  outlined  three  phases  in  the  evolution  of  the  SEL  During  Phase  1.  the  em- 
phasis was  on  understanding  the  environment  and  measurement.  Phase  2 focused  on  improving 
the  process  and  the  product.  Phase  3 involved  packaging  the  SEL  experiences  for  reuse,  recogniz- 
ing what  was  appropriate  for  the  SEL  and  implementing  improvements  to  the  SEL  environment. 
He  stressed  that  the  SEL  process  is  a hierarchy:  improvement  depends  on  understanding,  and 
assessment  and  improvement  always  precede  packaging. 

Basili  explained  that  the  SEL  framework  is  based  on  three  evolving  concepts:  the  Quality  Improve- 
ment Paradigm,  the  Goal-Question-Metric  (GQM)  paradigm,  and  the  experience  factory.  These 
concepts  support  the  basic  SEL  belief  that  experimentation  is  necessary:  no  piece  of  technology, 
method,  tool,  or  process  model  works  under  all  circumstances. 

One  example  of  SEL  experimentation  is  the  current  evaluation  of  the  Cleanroom  process  to  deter- 
mine its  applicability  for  building  flight  dynamics  software.  A small,  controlled  Cleanroom  experi- 
ment was  successfully  run  at  the  University  of  Maryland.  From  this  experiment,  it  was  determined 
that  the  concept  of  “no  programmer  testing”  enforces  better  code  reading,  that  the  Cleanroom 
process  is  quite  effective  for  small  projects,  and  that  formal  methods  were  hard  to  apply  and  re- 
quired a fair  amount  of  skill.  Building  on  this  initial  work,  the  Cleanroom  experiment  was  then 
extended  to  a larger  case  study  at  NASA.  Key  lessons  learned  from  this  study  were  that  it  is  possible 
to  scale  up  from  a smaller  project  to  a larger  one  and  that  the  use  of  Cleanroom  techniques  survived 
very  well  in  an  environment  with  changing  requirements.  Overall  there  were  increased  productiv- 
ity and  a lower  error  rate.  A negative  lesson  learned  was  that  better  training  was  needed.  An  even 
larger  lesson  was  that  successful  software  engineering  requires  allowing  for  change. 
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Basili  stated  that  packaging  technology  requires  two  things:  (1)  the  continual  accumulation  of 
evaluated  experiences  in  a form  that  can  be  effectively  understood  and  modified  and  (2)  their 
collection  into  a repository  of  integrated  experience  models.  The  experience  factory  concept  sup- 
ports technology  packaging  by  analyzing  and  synthesizing  all  kinds  of  experience  models,  acting  as 
a repository  for  such  experience,  and  supplying  information  on  that  experience  to  various  projects 
upon  demand. 

What  is  needed  now,  according  to  Basili,  is  a set  of  experience  factories  in  a variety  of  domains,  each 
focused  on  packaging  local  experiences,  building  and  tailoring  local  models,  integrating  technolo- 
gies, studying  the  issue  of  scale  up,  and  developing  automated  aids.  Basili  encouraged  researchers 
to  take  advantage  of  the  experimental  nature  of  software  engineering  to  provide  gains  both  to  indus- 
try and  to  the  research  domain  in  a symbiotic  relationship;  he  further  encouraged  them  to  learn 
from  and  build  on  the  experiences  of  the  SEL. 


Impacts  of  a Process  Improvement  Program  in  a Production  Environment 

Gerald  Page 


In  assessing  the  impact  of  a process  improvement  program  in  a production  environment.  Page’s 
presentation  addressed  the  question  “Are  we  [CSC  as  part  of  the  SEL]  any  better  for  trying  to  create 
an  optimizing  process?”  Page  explained  that  the  SEL  process  improvement  environment  has  been 
characterized  by  a conscious,  continuous  effort  to  build  higher  quality  systems  by  (1)  understand- 
ing the  environment,  (2)  proposing  changes  to  that  environment  and  then  measuring  and  evaluating 
those  changes  against  baselines,  and  (3)  capturing  and  packaging  that  experience  to  optimize  the 
process  and  anticipate  uncontrollables.  These  continuous  efforts  to  improve  have  led  to  numerous 
changes  over  the  last  15  years. 

Following  are  three  views  of  the  types  of  changes  made  in  the  SEL 


• Life-Cycle  Process  Changes:  Testing  activity  was  identifled  as  a weak  area  for  the  SEL. 
Investigations  of  various  testing  techniques,  including  the  independent  verification  and 
validation  methodology  (TV&  V),  were  conducted.  IV&V  was  judged  to  be  inappropri- 
ate for  the  SEL  but  code  reading,  found  to  be  by  far  the  most  cost-effective  testing  tech- 
nique, was  added  to  the  SEL  process. 


• Technology/Methodology  Changes:  Small  improvements  were  experienced  from  simply 

introducing  a disciplined  methodology  into  the  environment,  but  it  was  concluded  that 
gaining  significant  improvement  would  require  dramatic  changes  in  the  methodology, 
technology,  or  organizational  structure  of  the  SEL  The  subsequent  introduction  of  Ada 
technology  in  the  SEL  produced  very  promising  results  in  software  reuse,  leading  to  fur- 
ther attempts  to  optimize  the  process.  Cleanroom  techniques  have  also  recently  been 
successfully  used. 


• Organizational  Changes:  Three  organizational  changes  were  involved:  staff  turnover  or 
growth,  staff  background,  and  domain  growth. 

The  problems  presented  by  staff  tumover/growth  were  addressed  by  creating  new 
standards  and  guidelines  and  augmenting  existing  ones.  These  standards  have 
created  a more  homogeneous  environment,  in  which  the  use  of  consistent  terminol- 
ogy and  procedures  minimizes  disruptions  due  to  staff  turnover.  A Software  Man- 
agement Environment  (SME)  tool  has  also  proven  useful  in  allowing  a manager  to 
compare  a current  project  with  past  projects  to  determine  the  outcome  for  any  new 
project  thought  to  be  deviating  from  the  process  models. 

Staff  background  has  changed  over  the  15  years  from  primarily  mathematics  and 
physics  to  primarily  computer  science.  The  necessary  flight  dynamics  training  has 
been  provided  through  a required  training  program  designed  to  help  software 
developers  understand  the  engineering  applications  in  the  SEL  environment. 

Domain  growth  (requiring  a staffing  increase  from  35  to  approximately  250  devel- 
opers over  15  years)  has  been  addressed  by  augmenting  the  SEL  methodology, 
making  it  more  flexible  and  able  to  handle  new  applications. 

Page  identified  several  ways  in  which  CSC  has  capitalized  on  the  learning  from  these  experiences 
and  changes: 

• A system  development  methodology  has  been  devised  based  on  the  SEL  experience, 
with  standards  and  procedures  developed  to  make  that  methodology  more  effective. 
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• The  SEL  experience  with  quantitative  management  has  been  packaged  in  a manager’s 
handbook  and  in  the  “Data  Collection,  Analysis,  and  Reporting  Handbook."  This  gives 
managers  guidance  on  how  to  monitor  a project  and  how  to  predict  what’s  going  on. 

• Required  training  programs  have  been  developed  for  the  different  roles  in  system  devel- 
opment; analysis  engineers,  developers,  testers,  integrators,  and  managers. 

• Measurement-based  engineering  process  groups  (researchers,  experience  factory 
people,  experimenter  teams)  have  been  established.  These  groups  recommend  changes, 
develop  measures  to  evaluate  the  changes,  and  then  package  the  information  to  institu- 
tionalize the  improvements  found. 

Citing  statistics  in  several  categories.  Page  related  that  although  system  complexity,  general  re- 
quirements, and  system  size  have  doubled,  error  rates  have  gone  down,  the  cost  of  code  has  re- 
mained relatively  constant,  and  effort  and  schedule  estimating  are  improving.  Therefore,  the 
answer  to  the  question  “Are  we  any  better  off?”  is  a resounding  “YES." 
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Results  of  15  Years  of  Measurement  in  the  SEL 
FVank  McGarrv 

McGany’s  presentation  focused  on  what  has  happened  in  the  SEL  over  the  past  15  years  and  dis- 
cussed areas  in  which  SEL  measurement  efforts  have  made  substantial  contributions  to  NASA/ 
Goddard’s  understanding  of  software  engineering. 

During  this  period  of  time,  over  75  to  85  projects  supporting  NASA  missions  in  a production  envi- 
ronment have  been  studied  in  SEL  experiments.  From  these  experiments,  a large  amount  of  soft- 
ware metrics  data  has  been  extracted  to  generate  the  SEL  data  base.  In  addition,  150  to  200  reports 
have  been  produced  describing  experiences  and  results  of  the  experiments.  McGarry  stated  that 
the  SEL  has  gained  seven  major  insights  from  these  project  experiments  and  reports,  with  each  of 
the  points  substantially  confirmed  by  multiple  experiments  and  documented  in  the  SEL  literature. 

Measurement  is  an  essential  element  of  software  process  improvement.  McGarry  stressed  that 
measurement  is  critical  in  developing  a baseline,  in  understanding  changes  to  an  organization’s 
process,  and  in  assessing  impacts  of  these  process  changes  on  its  products.  Because  of  measure- 
ment in  the  SEL,  impacts  of  process  changes  have  been  observed  and  determinations  made  if  the 
changes  have  in  fact  lead  to  improvements.  Measurement  has  also  proven  valuable  as  a manage- 
ment aid;  for  example,  measuring  error  rates  has  provided  an  early  indication  of  software  quality. 

Many  potential  diversions  exist  that  can  sidetrack  a measurement-based  process  improvement 
program.  The  SEL  has  experienced  three  such  areas  of  diversion;  excessive  planning  and  replan- 
ning, over  dependence  on  statistical  analysis,  and  spending  too  much  time  looking  at  methodolo- 
gies and  technologies  that  in  the  end  don’t  improve  the  process.  McGarry  cautioned  that  these  and 
other  diversions  should  be  watched  for  and  avoided. 

People  are  the  most  important  resource/technology.  There  is  a tremendous  difference  in  people’s 
potential.  However,  prudently  applied  methodologies/technologies  can  optimize  the  potential  of 
people,  and  this  is  where  the  focus  should  be. 

Environmental  characteristics  should  dictate  the  selected  software  engineering  techniques.  Spe- 
cific measures/techniques  may  not  apply  to  all  domains.  Methodologies  and  technologies  must  be 
tailored  to  the  environment,  with  associated  standards  and  policies  tailored  to  optimize  the  partic- 
ular processes  found  to  be  effective  in  that  environment.  The  SEL  definition  of  effective  standards 
requires  that  they  be  written,  understood,  legacy-based,  enforced,  and  measurable. 

Automation  is  an  instrument  of  process  improvement,  not  a replacement  for  process  understand- 
ing. Only  those  processes  that  are  very  clearly  understood  and  that  can  be  done  manually  can  be 
automated,  lools  can  provide  significant  benefit  to  a well-defined  experience  base,  and  effective 
tools  must  address  defined  process  needs.  Immature  processes  are  not  automatable. 

The  heritage  of  an  environment  will  strongly  influence  the  process  followed.  It  was  found  that 
making  significant  changes  to  the  process  nonetheless  induces  very  slow  changes  in  the  environ- 
ment and  the  products  generated,  because  corporate  memory  remains  the  overriding  principle  of 
the  way  software  is  developed  in  that  environment  Significant  process  change  also  requires  a sig- 
nificant expenditure  of  effort  and  time. 

Software  can  be  measurably  improved  through  the  appropriate  use  of  available  technologies.  A 
combination  of  the  appropriate  software  methodology  and  technology  can  produce  a significant, 
favorable  impact  on  the  way  software  is  developed.  Examples  of  technologies  that  have  worked 
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successfully  in  the  SEL  are  code  reading,  design  criteria,  use  of  Ada.  objected-oriented  techniques. 
Clean  room,  and  management  through  measurement.  Code  reading  has  repeatedly  been  shown  to 
improve  software  reliability  at  essentially  no  additional  cost.  Design  criteria  standards  have  been 
demonstrated  to  produce  more  error-free  software.  Use  of  Ada  and  object-oriented  design  have 
yielded  substantial  cost  benefits  through  reuse.  The  Cleanroom  technology  has  shown  improve- 
ment in  reliability  and  productivity  and  has  helped  reduce  computer  resources  consumption.  Fi- 
nally, major  changes  in  planning,  adjusting,  and  control  techniques  have  improved  cost  and 
schedule  estimation. 

McGany  stated  that  NASA’s  investment  in  the  SEL  has  been  substantial,  but  comparing  various 
aspects  of  software  development  in  the  1976-1980  timeframe  with  those  of  the  1986-1990  timeframe 
demonstrates  that  there  have  also  been  many  benefits.  McGany  specifically  cited  such  benefits  as 
very  controlled,  predictable  cost  per  line  of  code:  increased  reliability:  increased  reuse:  and  de- 
creased rework.  Other  major  improvements  have  been  increased  manageability,  with  less  depen- 
dence on  the  capabilities  of  the  personnel;  the  production  of  more  predictable  and  consistent 
software:  and  the  development  of  a rationale  for  the  methods  used. 


CdveT/Smith 


ITT 


• « 


T3J 


SESSION  2 — PROCESS  IMPROVEMENT 

Maxvin  Zelkowitz  of  the  University  of  Maryland  chaired  this  session.  In  his  opening  remarks,  he 
stated  that  the  papers  in  this  session  addressed  the  following  problem:  If  most  organizations  use 
the  typical  “waterfall”  software  life  cycle,  what  variations  in  the  process  will  improve  the  quality  of 
the  software  developed? 

The  three  speakers  in  this  session  were  Richard  Nance  of  the  Virginia  Polytechnic  Institute  (VP I), 
Mark  Arend  of  McDonnell  Douglas,  and  Tbm  Lydon  of  Raytheon. 

A Framework  for  Assessing  the  Adequacy  and  Effectiveness 
of  Software  Development  Methodologies 
Richard  Nance 

Nance  reported  that  this  work  arose  when  his  group  was  asked  to  review  two  software  development 
methodologies  (A  and  B);  to  compare  and  evaluate  them;  and  to  assess  the  costs  and  benefits  of 
continuing  with  both,  using  only  one,  or  merging  the  two  in  some  fashion.  The  major  steps  in  the 
study  were  determining  an  evaluation  approach,  developing  an  evaluation  procedure;  applying  the 
evaluation  procedure,  and  summarizing  the  results. 

Nance  explained  that  since  no  comparative  development  procedures  were  found  documented  in  the 
literature,  a relevant  evaluation  procedure  was  first  developed  and  then  applied.  The  rationale  for 
VPTs  evaluation  procedure  was  the  following:  A project-level  software  development  methodology 
should  have  a set  of  clear  objectives;  a process  that  clearly  defines  the  principles  needed  to  reach 
those  objectives;  and,  finally,  adherence  to  a process  utilizing  these  principles  that  produces  a prod- 
uct with  certain  identifiable  attributes.  This  philosophy  must  be  tempered  by  practical  concerns, 
such  as  keeping  the  framework  open,  adjusting  to  differing  priorities,  recognizing  attribute  sam- 
pling, and  being  flexible  in  evaluation  procedure  application. 

The  study  team  determined  a statement  of  objectives,  principles,  and  attributes  for  methodologies 
A and  B;  identified  objectives  at  the  project  level  that  led  to  principles  at  the  process  level;  and  lastly 
defined  attributes  at  the  product  level.  The  objectives  and  principles  were  then  linked,  showing  a 
high  degree  of  interplay  among  principles.  Properties,  the  measurable  things  that  reflect  the  pres- 
ence or  absence  of  the  desirable  or  beneficial  attributes  in  the  product,  were  also  defined. 

Linkages  between  objectives  and  principles  and  between  principles  and  attributes  were  defined 
and  substantiated.  The  evaluation  procedure  was  then  applied  in  a top-down  approach  to  deter- 
mine the  adequacy  of  a methodology.  Assessing  the  effectiveness  of  methodologies  A and  B was 
achieved  through  a bottom-up  evaluation  process. 

Finally,  Nance  presented  the  results  from  comparing  methodologies  A and  B using  this  procedure. 
He  stated  that  one  of  the  evaluation  procedure’s  greatest  strengths  was  being  able  to  describe 
things  in  terms  that  management  could  readily  understand.  Future  efforts  win  include  extending 
this  evaluation  approach  to  the  issue  of  software  quality  assessment 
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A Method  for  Tailoring  the  information  Content  of  a Software  Process  Model 

Mark  Arend 

Arend's  presentation  discussed  a procedure  for  tailoring  the  documentation  products  and  portions 
of  a methodology  dictated  by  a given  software  process  model.  Tailoring  is  the  act  of  taking  a fully 
defined  software  process  model  or  methodology  and  selecting  those  items  that  are  necessary  based 
on  the  nature  of  the  specific  product  to  be  developed.  Arend  stressed  that  software  quality,  the 
degree  to  which  software  matches  the  customer/user  needs,  is  an  important  consideration  that 
must  be  ensured  whenever  tailoring  is  employed. 

Arend  explained  that  tailoring  is  usually  guided  by  personnel  experience,  ability,  and  tradition,  and 
that  the  McDonnell  Douglas  team  found  no  formal  guidelines  for  tailoring  methodologies.  The 
procedure  they  subsequently  developed  and  followed  in  their  study  was  a step-by-step,  cohesive 
approach  to  tailoring,  one  that  had  customer  needs  and  product  quality  requirements  as  the  driving 
factors.  Customer/user  needs  were  identified  and  an  approach  used  to  reflect  these  needs  in  a 
subset  of  information  products  extracted  from  all  possible  information  products  identified  in  their 
process  model. 

To  characterize  customer/user  needs,  software  quality  assurance  (SQA)  concepts  were  applied. 
SQA  involves  defining  quality  factors  and  quality  criteria.  User-oriented  quality  factors  were  cap- 
tured through  the  use  of  questionnaires  and  customer/user  interviews.  Quality  criteria,  generally 
more  software  oriented  and  more  closely  related  to  software  testability,  were  directly  derived  from 
the  quality  factors.  Once  a good  set  of  factors  that  the  user  wanted  in  the  software  were  identified,  a 
larger  set  of  criteria  supporting  the  existence  of  those  factors  was  identified.  Development  and 
management  techniques  were  then  selected  that  would  ensure  the  presence  of  these  quality  criteria. 

In  the  key  step  of  the  tailoring  procedure,  information  products  that  matched  or  supported  the 
chosen  development  and  management  techniques  were  selected  and  tailored.  Arend  explained  that 
information  products  act  as  specific  vehicles  that  force  us  to  recognize,  formalize,  and  adhere  to 
techniques  to  specify,  design,  and  implement  software  of  appropriately  selected  quality.  Therefore, 
an  appropriate  subset  of  all  possible  information  products  becomes  a significant  aid  in  reaching 
the  goals  of  the  given  software  project  and  especially  in  satisfying  customer  needs.  Arend  also 
stated  that  if  a design  methodology  is  not  already  imposed,  one  can  be  selected  based  on  matching 
the  information  products  chosen  from  a methodology  with  those  recommended  to  achieve  the 
product’s  quality  profile. 

Steps  remaining  to  be  applied  include  refining  the  quality  requirements  questionnaire,  devising  a 
way  to  weight  questionnaire  responses  to  quantify  products’  quality  profiles,  and  developing  a list 
of  information  products  sorted  by  quality  criteria. 
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Software  Technology  Insertion;  A Study  of  Success  Factors 

Tom  Lvdon 


r 


Software  technology  insertion  (STI)  in  this  study  consisted  of  two  parts:  ( 1)  selecting  a new  technol- 
ogy, typically  a method  or  tool,  and  (2)  creating  an  opportunity  to  insert  that  new  technology  in  a 
new  or  ongoing  software  project  Lydon  clarified  that  successful  STI  can  be  a perceived  success 
(the  user’s  sense  of  labor,  computer  cost  and  time  savings)  or  a real,  measured  success.  This  study 
concentrated  on  perceived  success  by  the  actual  project  users. 

For  the  purposes  of  this  study,  an  STI  case  applied  a single  technology  to  a single  project  usually 
within  a single  development  phase.  The  study  involved  13  different  projects  with  a variety  of  project 
characteristics,  21  new  software  technologies,  and  numerous  study  factors,  among  them  technology 
type,  maturity,  insertion  method,  and  project  size. 

TWo  key  people  (the  lead  engineer  and  the  department  manager)  for  each  project  were  surveyed  for 
their  perceptions  of  success  or  failure.  Based  on  these  surveys  and  on  responses  to  six  study  ques- 
tions, each  STI  test  case  was  ranked  and  evaluated.  A dose  examination  of  the  top  11  rated  STI 
cases  indicated  that  the  main  reasons  for  success  were(l)  synergy  within  a project  (a  good  relation- 
ship between  the  using  and  supporting  organizations  and  a positive  attitude  by  the  affected  man- 
ager), (2)  critical  need  for  the  capability,  (3)  synergy  between  two  technologies,  and  (4)  use  of  a 
mature  and  powerful  tool.  It  was  significant  that  three  of  these  factors  were  organizational  in  na- 
ture, with  only  one  factor  related  to  what  was  inherent  in  the  technology.  In  these  11  cases,  saving 
computer  costs  may  or  may  not  have  occurred,  and  meeting  expectations  was  not  so  important  as 
the  perception  of  time  or  labor  savings  or  quality  improvement. 

In  the  bottom  seven  rated  STI  cases,  the  main  reasons  for  “failure”  were  (1)  the  technology  was 
immature,  (2)  interface  problems  arose,  (3)  the  technology  was  judged  to  be  “not  needed”  by  the 
lead  engineer,  and  (4)  the  wrong  technical  solution  was  used.  For  these  least  successful  cases,  the 
technologies  being  tried  may  or  may  not  have  improved  quality  butwere  judged  to  have  failed  in 
saving  time,  saving  labor,  and  in  meeting  expectations. 

Lydon  reiterated  that  this  study  had  focused  on  success  factors  and  on  perceived  STI  success.  He 
summarized  the  results  as  follows: 

• Saving  schedule  time  and  labor  costs  was  the  driving  force  behind  the  successful  STI 
cases. 

• Improving  quality  seemed  to  be  a necessary  but  not  sufficient  condition  for  successful 
STI. 

• Exceeding  users’  expectations  was  not  necessary  for  successful  STI,  but  not  meeting  ex- 
pectations was  sufficient  for  failure.  The  lesson  is  that  a support  group  or  organization 
must  control  people’s  expectations. 

■ There  was  much  greater  success  for  competence-enhancing  (incremental)  improve- 
ments than  for  competence-destroying  technologies. 

• There  was  greater  success  with  mature  versus  young  or  old  technologies. 

• There  was  somewhat  greater  success  for  in-house  versus  outside  supported  technolo- 
gies. 
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Lydoo  related  that  the  next  step  is  to  link  perceived  success  with  real  success  via  software  metrics 
collection.  Raytheon  is  implementing  corporate-wide,  automatic  software  metrics  collection  as  a 
by-product  of  development 
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SESSION  3 -SOFTWARE  MEASUREMENT 

The  Session  3 chairman  was  John  Valett  of  the  Goddard  Space  Flight  Center.  The  three  papers  in 
this  session  addressed  different  aspects  of  software  measurement:  Walker  Royce  of  TRW  dis- 
cussed quality  metrics  and  how  they  might  be  utilized.  Bob  Loesh  of  NASA/JPL  presented  results 
from  a specific  measurement  effort  and  Bill  Agresti  of  the  Mitre  Corporation  discussed  using  de- 
sign measures  to  predict  system  quality.  These  measurement  activities  share  the  ultimate  goal  of 
showing  how  measurement  can  be  used  to  better  understand  and  improve  a software  product 

Pragmatic  Quality  Metrics  for  Evolutionary  Software  Development  Models 

WalKcLRoYtc 

Royce’s  presentation  discussed  experiences  on  a 1-miilion-line  Ada  project,  entitled  the  Command 
Center  Processing  and  Display  System  Replacement  (CCPDS-R),  undertaken  for  the  United 
States1  Ballistic  Missile  Early  Warning  Center.  TRW  recognized  that  its  transition  to  developing 
this  and  other  such  large  Ada  systems  would  require  significant  internal  research  to  identify 
changes  necessary  for  their  existing  software  development  and  metrics  collection/analysis  ap- 
proaches. Several  such  changes  included  adopting  an  evolutionary  versus  canonical  waterfall  de- 
velopment approach,  using  Ada  as  a compilable  design  language  as  well  as  implementation 
language,  and  adjusting  cost  and  schedule  estimating  techniques  accordingly.  A 350,000-line  sub- 
system was  used  as  the  pilot  for  selecting  meaningful  metrics,  collecting  data,  and  analyzing  results. 

The  main  objective  of  adopting  the  evolutionary  development  approach  was  to  minimize  rework. 
This  required  fixing  things  early  and  designing  for  change  to  accommodate  requirements  volatility. 
The  focus  of  TRW’s  software  quality  metrics  (SQM)  was  maintainability— how  easy  would  it  be  to 
change  the  software  throughout  the  lifecycle.  Interpersonal  communications  were  also  minimized 
using  a small,  expert  design  team;  a layered  architecture;  and  Ada  as  a self-documenting,  life-cycle 
language. 

Quality  was  defined  as  the  degree  of  compliance  with  customer  expectations  of  function,  perform- 
ance, cost,  and  schedule.  Quality  metrics  were  derived  from  measuring  the  amount  of  rework  and 
plotting  these  measures  as  they  evolved  over  time.  The  evolutionary  development  approach  sup- 
ported this  quality  assessment  by  permitting  tangible  insight  into  the  end  product  beginning  with 
the  very  earliest  stages  of  the  program.  For  example,  because  a significant  amount  of  coding,  test- 
ing, and  product  demonstration  had  already  been  conducted  by  critical  design  review  (CDR),  the 
CCPDS-R  approach  allowed  prediction  in  some  objective  terms  of  the  future  maintainability  and 
reliability  of  the  software. 

Development  progress  over  the  life  cycle  of  the  program  included  approximately  six  builds,  reach- 
ing lOG-percent  development  around  month  35.  Royce  related  that  fewer  than  0.5  problems  per 
1000  lines  of  code  occurred  during  the  development  and  test  phases.  Addressing  difficult  design 
issues  early  to  avoid  major  “breakage”  later,  when  there  would  be  a larger  configuration  to  main- 
tain, was  a major  project  goal  in  containing  rework.  Tracking  total  rework  versus  closed  rework 
provided  useful  data  as  a progress  indicator.  Insight  into  the  activities  of  the  test  and  maintenance 
organizations  and  reacting  to  problems  before  they  escalated  were  also  important  to  the  success  of 
the  program. 

Royce  summarized  some  of  the  metrics  from  the  effort.  Rework  Proportions:  6.7  percent  of  the 
total  manpower  devoted  to  software  was  spent  doing  rework  on  configuration  baselines.  Approxi- 
mately 13 5 percent  of  the  total  product  had  to  be  reworked  prior  to  being  delivered.  Modularity: 
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The  average  breakage  per  change  was  approximately  53  SLOC  per  system  change  order  (SCO). 
Changeability:  The  average  SCO  took  approximately  2 man-days  to  resolve  and  fix.  Change  effort 
became  quite  predictable  and  stabilized  over  time,  demonstrating  the  success  of  the  approach  and 
the  usefulness  of  the  metrics.  Maintainability:  Maintainability  was  defined  as  a normalized  rework 
productivity.  The  software  in  this  effort  was  determined  to  be  approximately  one-half  as  complex  to 
change  as  it  was  to  develop  from  scratch. 

Rorce  concluded  that  selecting  and  implementing  meaningful  software  quality  metrics  require  con- 
sistency of  application,  the  use  of  automated  tools,  and  management  and  practitioner  acceptance. 
One  advantage  of  the  TRW  approach  was  that  it  produced  quantitative  data  for  decision  making 
and  for  determining  requirements  compliance  in  areas  such  as  maintainability,  modularity,  and 
adaptability.  It  also  provided  historical  data  for  better  future  planning.  The  bottom  line  is  that 
such  quality  metrics  can  be  and  are  being  used  effectively  on  large  projects. 
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RWP  Project  Ada  Development  Metrics  and  Observations 

Rotert  Lresh 


Loesh’s  presentation  discussed  metrics  experiences  cn  the  Real-time  Weather  Processor  (RWP) 
System  being  developed  by  JPL  for  the  Federal  Aviation  Administration  (FAA).  The  software- 
intensive  system,  comprised  of  approximately  97,000  Ada  statements  and  280,000  lines  of  commer- 
cial, off-the-shelf  software  (COTS)  implemented  in  C,  has  been  installed  on  commercially  available 
hardware.  Over  the  project’s  3-year  development  period,  JPL  intensively  tracked  Software  Prob- 
lem Failure  Report  (SPFR)  activity  and  performed  extensive  analysis  of  error  counts,  time -to- fix. 
requirements  documents  changes,  and  system  specification  changes. 

According  to  Loesh.  approximately  40  percent  of  the  222  issues  addressed  by  the  design  team  dealt 
with  interface  concerns,  a strength  predicted  by  Ada  advocates  and  clearly  exploited  by  RWP  devel- 
opers. Thking  a slightly  different  perspective,  nearly  66  percent  of  the  total  issues  were  identified  by 
people  preparing  test  descriptions  and  procedures.  This  surprising  statistic  highlighted  a major 
lesson  learned  on  the  project  write  the  initial  version  of  system  test  procedures  as  early  as  possi- 
ble. The  scope  of  the  rework  associated  with  such  test-procedures- related  problem  reports  is  nar- 
rower because  test  procedures  are  written  relatively  early  in  the  development  cycle,  when  the 
immature  system  is  still  of  manageable  size. 

Loesh  focused  on  data  collected  in  the  areas  of  system  growth  as  a function  of  the  number  of 
changes  applied  to  the  system  and  of  testing  and  test  strategies.  Of  the  approximately  2100  SPFRs 
generated  to  date,  he  stated  that  the  majority  have  been  submitted  in  the  system  integration  testing 
phase  (40  percent)  and  in  the  CSCI  integration  phase  (18  percent).  When  JPL  observed  through  its 
metrics  program  that  only  9 percent  of  the  errors  were  uncovered  during  individual  CSCI  testing, 
this  type  of  activity  was  shortened  because  of  the  lew  return  for  the  effort  invested.  This  allowed 
JPL  to  concentrate  its  effort  on  the  more  revealing  CSCI  integration  and  system  integration  testing 
phases. 

During  system  integration  testing  for  the  Ada  code,  about  5 errors  per  1000  Ada  statements  were 
recorded,  and  there  were  approximately  23  errors  per  1000  carriage  returns.  These  numbers  are 
nearly  one-half  the  rate  typically  found  in  JPL’s  FORTRAN  projects  of  similar  size.  Loesh  stressed 
that  this  result  demonstrated  the  benefit  of  Ada:  product  quality  improvement  is  notable,  even  if 
transitioning  organizations  are  still  struggling  with  extended  schedules  and  higher  costs. 

Loesh  concluded  with  a discussion  of  error  correction  counts  and  the  associated  effort  required, 
observations  relating  to  portability  issues,  tools  used  to  support  JPLs  portability  analysis,  and 
levels  of  risk  involved  to  convert  portions  of  the  RWP  code  to  increase  its  portability.  At  this  point. 
Loesh  stated  that  the  RWP  team,  incorporating  X Windows  and  placing  emphasis  on  designing  for 
portability,  has  achieved  approximately  75-percent  to  90-percent  portability  for  the  future. 

Finally,  he  stressed  that  the  project’s  metrics  analysis  is  considered  preliminary,  with  adjustments 
expected  shortly  when  the  entire  project  is  completed. 


6208-0 


Cow/Smiih 
CSC 
14  of  48 


Agresti  discussed  an  ongoing  research  project  whose  primary  objective  is  to  test  the  hypothesis  that 
Ada  software  quality  factors  can  be  predicted  during  the  design  phase.  Project  goals  include  devel- 
oping a set  of  diagnostic  capabilities  and  determining  ways  to  quantitatively  assess  the  designs  of 
large  systems.  The  technical  approach  taken  was  to  build  multivariate  models  to  estimate  reliabil- 
ity and  maintainability  and  to  examine  the  characteristics  of  the  software  design  itself  as  captured 
in  the  Ada  design  language.  Agresti  emphasized  that  since  this  project  is  still  in  the  early  stages,  the 
results  given  in  this  presentation  were  preliminary. 

The  study  postulated  that  by  analyzing  a software  system’s  static  design  structure,  examining  such 
attributes  as  design  complexity,  coupling,  and  cohesion,  researchers  could  predict  the  quality 
(number  of  enors)  expected  for  a finished  system.  To  test  this  notion,  the  experiment  looked  at  the 
basic  architectural  decision  choices  possible  in  Ada  designs. 

Two  estimation  models  were  developed,  one  for  reliability  and  one  for  maintainability.  To  allow  for 
several  definitions  of  reliability  and  maintainability  and  to  accommodate  a variety  of  measures, 
each  of  the  quality  factors  was  modeled  as  a function  of  several  parameters,  including  design  char- 
acteristics, environmental  factors,  model  parameters,  and  an  error  term.  Design  characteristics  at 
the  architectural  level  are  the  features  extracted  from  the  design  artifact,  such  as  context  coupling 
and  visibility,  while  environmental  factors  include  items  extraneous  to  the  artifact  such  as  the  vola- 
tility of  changes  to  the  software  and  the  reuse  level.  The  error  term  accounts  for  any  unexplained 
variation. 

A simple  notion  of  a static  Ada  architecture  was  used,  in  which  Ada  structures  were  composed  of 
design  units  from  a “parts”  bin  and  design  relations  from  a “connections”  bin.  This  required  care- 
ful consideration  of  the  many  different  types  of  “parts”  and  the  various  kinds  of  “connections" 
possible. 

“Legal”  Ada  compilation  units  were  composed  from  these  parts  and  connections  and  were  used  as 
a framework  to  evaluate  project  data  on  21  Ada  subsystems  from  the  SEL  data  base.  In  particular, 
the  study  looked  at  the  software’s  reliability  (error  counts)  and  maintainability  (time-to- 
isolate-and-fix)  data  tracked  by  the  SEL  In  general,  there  was  good  variability  in  reliability  and 
maintainability,  the  dependent  variables.  In  some  of  the  subsystems,  compilation  units  were  parti- 
tioned into  library  unit  aggregations,  and  the  pattern  of  information  access  throughout  these  units 
was  also  studied. 

As  an  important  part  of  the  study,  the  team  examined  the  number  and  scope  of  declarations  made 
visible  to  the  units  of  a library  aggregation  (via  a “with”  clause),  looking  for  the  effect  that  the  num- 
ber of  imported  and  exported  declarations  may  have  on  a system’s  future  reliability  and  maintain- 
ability. The  basic  idea  was  to  identify  from  where  and  at  what  levels  external  resources  were 
accessed  for  that  unit  aggregation.  As  a result  of  such  analysis,  simple  static  measures  including 
the  number  of  imports,  the  number  of  exports,  and  the  number  of  cascaded  imports  were  compiled 
and  studied. 

Results  from  the  early  modeling  efforts  are  still  preliminary  due  to  a limited  amount  of  data.  How- 
ever. Agresti  provided  initial  findings  for  the  reliability  model  case,  in  which  the  dependent  variable 
is  errors  per  1000  lines  of  code;  the  relevant  variables  are  context  coupling,  visibility,  and  volatility, 
t A good  fraction  of  the  variation  in  error  rate  is  being  explained  [R2  = 0.72  (adjusted)];  context 

coupling  and  change  rate  are  significant  in  explaining  the  variation. 
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Since  system  and  acceptance  testing  errors  might  better  reflect  the  architectural  issues  of  interest 
(i.e.t  the  interunit  relations  captured  in  the  design),  a preliminary  reliability  model  was  determined 
for  errors  recorded  in  these  phases  (unit  testing  error  counts  were  excluded  here).  The  error  rate 
variation  again  seemed  well  explained  [R2  = 0.78  (adjusted)].  The  context  coupling  and  visibility 
design  characteristics  contributed,  but  the  nugor  environmental  factor  was  reuse;  custom  code 
was  a strong  indicator  in  explaining  the  variations  in  the  data. 

Agresti  related  that  the  team’s  early  results  in  formulating  estimating  models  for  reliability  and 
maintainability  have  been  encouraging,  and  that  they  look  forward  to  exploring  additional  hypothe- 
ses and  to  developing  more  robust  models  that  can  be  subjected  to  validation. 
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SESSION  4 -REUSE 

Session  4 addressed  software  reuse.  Sharon  Waiigora  of  Computer  Sciences  Corporation  chaired 
the  session  and  gave  some  opening  remarks  on  the  reuse  issue.  One  of  the  biggest  challenges  facing 
software  engineers  and  managers  today  is  the  ever-increasing  demand  to  build  larger,  more  com- 
plex systems  with  more  limited  resources.  If  the  current  trend  continues,  it  is  expected  that  both 
adequate  funding  and  qualified  people  will  become  more  scarce  in  the  future.  Industry  leaders  are 
looking  to  software  reuse  to  help  meet  this  challenge.  Some  believe  that  axle  reuse  is  the  answer, 
while  others  believe  that  reuse  can  be  maximized  only  through  changes  to  the  early  phases  of  the  life 
cycle.  The  three  speakers  in  this  session  addressed  three  aspects  of  reuse.  Pablo  Straub  of  the 
University  of  Maryland  discussed  the  need  to  improve  specifications  so  that  designers  are  free  to 
create  reusable  components.  The  second  speaker.  Rush  Kester  of  Computer  Sciences  Corpora- 
tion, described  an  effort  to  characterize  successful  Ada  code  reuse  in  the  SEL  environment  as  a 
basis  for  developing  guidelines  for  creating  reusable  components  in  the  future.  The  final  speaker  in 
the  session  was  Don  Reifer  of  Reifer  Consultants,  Incorporated;  his  talk  addressed  reuse  metrics 
and  the  resulting  lessons  learned. 

Bias  and  Design  in  Software  Specifications 

Pafeto-Straub 

Specification  reuse  has  been  recognized  as  a key  to  achieving  significant  increases  in  software 
reuse.  However,  specification  reuse  can  be  difficult  because  specifications  are  often  subliminally 
tied  to  particular  implementations.  Straub  addressed  this  problem  by  studying  implementation 
bias,  or  overspecification,  which  may  be  translated  as  the  tendency  for  £ specification  to  implicitly 
direct  the  details  of  an  implementation.  His  study  produced  a classification  of  requirements  whose 
goal  is  to  define  a framework  to  explain  the  nature  of  implementation  bias.  The  theory  yields  a 
precise  definition  of  bias  and  demonstrates  that  despite  efforts  to  the  contrary,  bias  is  inherent  in 
specifications. 

Unlike  the  canonical  software  life  cycle  in  which  "specification”  appears  as  only  'he  first  phase, 
Straub  chose  to  regard  the  product  of  each  phase  as  the  specification  for  the  next  phase.  Straub’s 
successive  refinement  of  the  original  specification  (or  “staged  specification”  approach),  and  the 
inherent  potential  for  introducing  new  or  expanding  existing  errors  at  each  subsequent  step,  high- 
lighted the  need  for  high  quality,  abstract  (general),  and  complete  specifications.  Other  desirable 
qualities  are  that  specifications  be  consistent  correct  reusable,  and  traceable. 

Straub  explained  that  typically  the  rule  to  avoid  overspecification  or  bias  has  been.  "Specify  what 
the  system  should  do,  not  how  to  do  it”  But  he  also  stated  that  keeping  these  terms  clear  in  every- 
one’s mind,  especially  through  successive  phases,  isn’t  always  easy  and  that  the  result  of  such  confu- 
sion is  often  the  inverse  of  implementation  bias.  i.e„  underspecification.  Underspecification  in  turn 
leads  to  assumptions  about  the  final  product  or  may  introduce  other  errors  into  the  software. 

To  address  the  over-versus-undcr  specification  problem,  Straub’s  study  defined  a framework  for 
classifying  the  requirements  in  a specification.  He  discussed  interrelationships  among  the  many 
elements  of  this  classification  scheme,  stressing  that  among  other  things,  an  analyst’s  aim  should  be 
to  produce  specifications  free  of  extraneous  attributes  (those  arising  from  misconceptions)  or  im- 
posed attributes  (those  resulting  from  a restriction  in  method  or  language).  Otherwise,  the  conse- 
quence of  bias  is  that  the  solution  adopted  is  not  the  optimal  one.  Additionally,  he  stated,  bias 
cannot  be  completely  eliminated;  as  long  as  there  are  nonexplicit  requirements,  which  is  always  the 
case  because  specifications  are  rarely  complete,  there  will  be  a potential  for  bias. 
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Other  considerations  need  future  effort:  the  term  requirement  must  be  formally  defined:  a method 
to  identify  bias  must  be  developed;  and  a formalism  to  write  specifications  with  attributes,  such  as 
the  origin  of  the  requirement,  must  be  devised. 

Another  side  effect  of  this  research  is  establishing  a relationship  between  bias  and  software  de- 
fects. Errors  can  be  related  to  fictitious  requirements;  and  faults  can  be  related  to  bias,  where  bias 
is  a minor  fault  that  doesn’t  make  the  system  unacceptable  but  does  make  it  nonoptimal.  Within 
this  context,  failure  can  be  linked  with  inefficiency. 

Future  efforts  will  focus  on  testing  these  ideas  by  measuring  bias  in  a specific,  sizeable  project  and 
on  exploiting  the  relationship  between  errors  and  bias. 


Co*er/Smith 


Ada  Reuse  Analysis  and  Representation  at  the  Software  Engineering  Laboratory  (SEL) 

RushJtetec 


The  focus  of  this  presentation  was  a description  of  graphical  representations  and  analysis  tech- 
niques developed  to  study  the  reuse  of  Ada  source  code  across  1990  Ada  components  in  the  SEL. 
Reuse  has  been  an  important  pan  of  the  culture  in  Goddard’s  Flight  Dynamics  Division  (FDD) 
environment  throughout  its  entire  history  because  of  the  assumption  that  there  are  economic  bene- 
fits directly  related  to  the  amount  of  code  reused  without  change.  With  high-level,  verbatim  reuse, 
systems  can  be  delivered  sooner  and  at  lower  cost;  can  be  improved  incrementally;  and  are  more 
reliable.  One  motivation  for  this  study  was  to  understand  the  effect  of  introducing  Ada  and 
object-oriented  design  (OOD)  in  the  SEL,  which  significantly  increased  the  potential  for  code 
reuse.  A high  degree  of  reuse  was  expected  due  to  the  nature  of  Ada  and  OOD  and  to  the  clear 
focus  of  the  FDD  application  domain;  however,  the  goai  of  this  study  was  to  confirm  this  hypothesis 
through  objective  techniques. 

Kester  stated  that  this  Ada  reuse  study  was  still  in  the  “understanding  and  characterizing  stage"  of 
the  SEL’s  process  improvement  paradigm.  The  primary  goals  of  the  current  phase  of  the  study  were 
to  determine  the  patterns  and  trends  of  reuse  and  to  understand  the  characteristics  that  distinguish 
the  reused  from  the  nonreused  components.  Secondary  goals  were  to  identify  candidate  compo- 
nents for  a reuse  library,  identify  the  applicable  domain  of  a component’s  reusability  within  the 
environment,  and  address  some  reuse-related  configuration  management  issues. 

Kester  related  that  the  FDD  environment  is  mostly  FORTRAN-oriented.  and  that  FORTRAN 
projects  do  emphasis  reuse  but  have  not  been  so  successful  as  the  Ada  projects.  He  illustrated  this 
by  citing  that  some  of  the  recent  Ada  systems  are  approaching  100  percent  total  reuse,  with  80  to 
90  percent  reuse  without  change. 

Kester  presented  a summary  of  six  types  of  analysis  reports,  many  of  them  graphical  representa- 
tions, that  were  used  in  the  study.  These  reports  identified  producer  and  consumer  projects: 
showed  the  steady  increase  in  reuse  from  one  generation  of  an  application  to  another;  highlighted 
component  lineage,  both  forward  and  backward;  identified  the  granularity  of  reuse  for  a particular 
project;  depicted  the  level  of  functionality  that  was  being  reused  (i.e.,  just  a single  component  or  an 
entire  branch  of  the  call  tree);  and  finally,  reflected  coupling  between  various  componei/s  through 
the  Ada  compilation  order. 

Some  reuse  patterns  and  trends  have  been  observed.  Initially,  application-independent  compo- 
nents, mostly  utilities,  were  reused;  now  most  reused  components  reflect  the  flight  dynamics 
domain.  Also,  Ada  “generics"  had  significant  impact  on  the  amount  of  reuse  without  changing 
components,  and  OOD  significantly  improved  modularity  and  allowed  component  reuse  from  one 
project  to  another. 

Future  efforts  will  include  developing  guidance  for  improving  the  way  software  is  designed  to  in- 
crease reuse  to  promote  further  economic  benefit  and  investigating  the  characteristics  that  distin- 
guish reusable  components. 
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Reuse  Metrics  and  Measurement— A Framework 
Donald  Rcifer 

Reifer  reported  on  the  culmination  of  3 years’  work  in  reuse  metrics  conducted  by  a national  team 
of  40  firms  under  the  auspices  of  the  Joint  Integrated  Avionics  Working  Group  (JIAWG),  chartered 
by  Congress  to  achieve  commonality  in  avionics  systems  across  all  new-generadon  aircraft.  In  this 
effort  a shared  technology  base  is  being  used  to  achieve  high  degrees  of  reuse  on  major  programs. 

Reuse  efforts  over  the  past  20  years  have  been  disappointing,  in  part  because  many  cultural  consid- 
erations impede  government  programs.  Examples  of  these  considerations  include  the  need  for 
financial  incentives  for  contractors  to  reuse  software,  the  lack  of  a functioning  reuse  identification- 
insertion-maintenance  framework  in  the  software  industry,  and  the  lack  of  a high-level  government 
advocate  to  champion  reuse.  Despite  these  difficult  challenges  to  achieving  effective  reuse.  Reifer 
stated  that  the  advent  of  systematic  reuse  via  an  objected-oriented  paradigm  involving  representa- 
tions. languages,  and  technologies  holds  promise  for  the  future.  The  significance  of  metrics  related 
to  reuse  will  likewise  grow  dramatically.  These  metrics  will  be  used  to  govern  fees  in  contracts,  to 
determine  the  efficiency  and  effectiveness  of  libraries,  and  to  judge  the  quality  of  reusable  software 
objects  against  minimum  standards  of  acceptability. 

Reifer  reiterated  that  metrics  are  key  to  achieving  successful  reuse,  but  that  for  these  metrics  to  be 
applied  effectively,  they  must  be  compatible  with  DOD  processes,  easy  to  collect  and  understand, 
objective  and  unbiased  in  nature,  predictive  of  the  future,  and  incur  a minimum  cost  for  measure- 
ment. He  indicated  that  the  JIAWG  is  working  in  several  areas  to  define  such  comprehensive, 
quantitative  metrics,  and  he  discussed  two  topics  particularly  important  to  the  participants:  the 
acquisition  ratio  and  the  reuse  ratio.  The  object  acquisition  ratio  is  a weighted  average  involving 
the  number  of  reusable  software  objects  (RSOs)  acquired  per  collection  in  relation  to  other  attrib- 
utes of  the  collection.  ["RSOs”  were  defined  as  life-cycle  products  developed  to  be  reused  (such  as 
designs,  algorithms,  code,  and  test  cases)  and  a "collection”  was  defined  as  a homogeneous  group- 
ing of  clustered  objects  (such  as  test  cases).]  Similarly,  the  object  reuse  ratio  involves  the  number  of 
reused  objects  in  a collection  as  a ratio  of  the  total  number  of  objects  in  that  collection,  the  number 
of  collections,  and  a weighting  factor  for  each  collection. 

The  study  found  that  the  cost  of  packaging  reusable  objects  varied  from  10  percent  (for  limited 
reuse  packaging)  to  36  percent  (for  extensive  reuse  packaging),  when  a reuse-insertion  infrastruc- 
ture was  already  operational.  Costs  would  be  higher  without  such  a functioning  infrastructure. 
However,  the  benefits  gained  from  reuse  ranged  from  20-percent  savings  (for  planned  reuse)  to 
60-percent  savings  (for  optimized,  domain-specific  reuse).  Ratios  were  also  used  to  obtain  a 
multiple-instance  reuse  economic  model  to  look  at  cost  benefits  across  multiple  deliveries  to  amor- 
tize the  cost  within  and  across  projects.  The  group  continues  to  refine  this  model  to  predict  cost 
savings  as  a function  of  the  number  of  reuse  instances  and  to  indicate  the  breakeven  point.  Quanti- 
fying and  analyzing  software  quality  factors,  such  as  correctness  and  testability,  and  defining  simi- 
lar quality  criteria  for  large  reuse  libraries  are  additional  areas  of  importance  to  the  JIAWG. 

Because  of  the  wide  range  of  metrics  efforts  described,  Reifer  stated  that  reuse  across  143  com- 
pleted Ada  projects  has  increased  from  10-percent  nominal  in  10  application  domains  in  1987  to 
21-percent  nominal  this  year.  To  further  improve  these  reuse  percentages,  the  JIAWG  is  attacking  a 
number  of  nontechnical  barriers  that  inhibit  the  effective  use  of  the  technology  base. 

Based  on  this  strong  track  record.  Reifer  concluded  that  the  JIAWG  is  not  just  studying  reuse — it  is 
doing  reuse. 
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SESSION  5-PROCESS  ASSESSMENT 

Session  5 was  chaired  by  Rose  Pajerski  of  the  Goddard  Space  Flight  Center.  The  presentations  in 
this  session  focused  on  software  process  assessment  and  the  need  to  have  an  understanding  of  the 
local  environment  before  an  attempt  is  made  to  improve  any  part  of  the  development  process.  The 
first  presentation,  given  by  Kyle  Rone  of  IBM,  covered  the  entire  software  life  cycle.  The  other  two 
presentations  in  the  session  involved  more  specific  parts  of  the  process.  Amrit  Goel  of  Syracuse 
University  discussed  process  assessment  activities  in  the  specifications  phase,  while  John  Kelly  of 
NASA/JPL  addressed  the  implementation  phase. 

Cost  and  Quality  Planning  for  Large  NASA  Programs 

Kite.  Rons 

Building  large,  complex  programs  is  very  difficult,  but  earlier  and  better  planning  can  help  mini- 
mize or  avoid  some  typical  problems.  Rone’s  presentation  addressed  how  to  do  planning  for  cost 
and  quality  on  such  large  programs  as  the  Space  Station  and  Earth  Observing  System  (EOS),  which 
require  very  particular  and  careful  planning. 

Several  essential  considerations  are  key  to  successful  planning:  (1)  ensuring  a compliant  product, 
(2)  generating  the  product  within  budget  and  within  schedule,  and  (3)  producing  a product  with  the 
appropriate  quality  level.  These  requirements  must  be  integrated  and  planned  concurrently  and 
consistently  across  releases. 

Rone  stressed  that  once  models  of  an  organization’s  process  have  been  defined  and  tested,  manag- 
ers must  be  able  to  calibrate  their  models  to  reflect  process  changes,  identify  project  considerations 
different  from  experiences  on  previous  projects  that  may  also  induce  changes  in  the  models,  and 
periodically  revise  the  set  of  management  and  planning  techniques  to  reflect  their  environment 
more  realistically. 

As  an  illustration  of  the  evolution  and  refinement  of  IBM’s  estimating  models.  Rone  reported  that 
size  estimates  for  Shuttle  flight  software  progressed  from  +11  percent  using  the  early  model  to 
-6  percent  using  the  middle  model,  to  only  -2  percent  on  later  missions.  More  astonishingly, 
ground  system  estimates  came  within  1 percent  of  actuals  overall.  Discrepancy  report  data  from 
the  Shuttle  were  also  plotted  over  time  and  produced  a well-defined  Rayleigh  curve.  These  curves 
were  used  to  estimate  error  estimates  across  the  process. 

Given  a set  of  models  that  truly  reflected  the  processes  of  the  environment,  a methodology  was 
followed  for  doing  both  software  cost  engineering  and  software  life-cycle  quality  management.  For 
cost  engineering,  user  requirements  were  broken  inw  a set  of  functions  and  the  system  size  was 
estimated.  Estimation  models  were  used  to  predict  the  labor  effort  phase  it  across  time  using  a 
Rayleigh  curve,  and  develop  the  resultant  schedules.  Other  costs,  such  as  subcontractors  and  over- 
head, were  also  added,  resulting  in  a project  cost  plan.  Periodic  feedback  and  replanning  allowed 
requirements  changes  to  be  incorporated  and  operational  increments  to  be  factored  into  the  plan. 
Rone  stressed  the  critical  importance  of  measurement  as  the  cornerstone  of  their  assessment  and 
replanning  cycle.  A similar  process  was  followed  for  software  life-cycle  quality  management 

Rone  cited  an  asymptotic  curve  relating  the  product  error  rate  to  the  percent  of  the  project  budget 
spent  on  independent  testing  (IV&V)  as  a particularly  beneficial  result  of  planning  based  on 
metrics.  It  was  found  that  spending  much  below  10  percent  on  IV&V  yielded  an  unsatisfactorily 
high  error  rate.  Above  that  basic  figure,  the  optimum  percent  spent  depends  on  the  quality 


requirements  (criticality)  of  the  particular  project  For  less  critical  projects  such  as  tools  and  mis- 
sion control  systems,  10  to  20  percent  yields  good  quality  (one  error  per  KSLOC)  for  the  least  expen- 
diture, For  highly  critical  projects  such  as  a Shuttle  model  80  percent  is  spent  for  IV&V  to  reach  an 
error  rate  of  only  0. 1 errors  per  KSLOC.  Spending  more  than  that  is  counterproductive,  since  small 
gains  in  quality  require  enormous  deltas  of  expenditure. 

The  software  tool  “Squeeze”  supported  Rone’s  team  in  performing  software  cost  and  quality  engi- 
neering for  this  study  and  took  into  account  such  project  characteristics  as  requirements,  com- 
plexity, size,  and  criticality,  as  well  as  process  and  environmental  characteristics,. 

Rone  concluded  that  managers  must  plan  and  continually  replan  and  that  well-developed  metrics 
are  an  essential  element  of  this  cycle.  The  essence  of  metric  management  is  to  demand  that  meas- 
urement be  used  to  assess  process  changes  and  to  provide  a more  disciplined  framework  necessary 
for  effectively  managing  large  projects. 
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Effect  of  Formal  Specifications  on  Program  Complexity  and  Reliability: 

An  Experimental  Study 
Amrit  God 

The  objectives  of  Goel’s  experimental  study  were  (1)  to  investigate  the  effect  of  using  formal  specifi- 
cations on  project  productivity,  reliability,  and  complexity  and  (2)  to  compare  the  results  with 
project  versions  developed  from  informal  specifications.  Formal  Z specifications  were  developed 
from  informal  specifications  for  the  NASA  Launch  Interceptor  Program  (UP).  These  formal  spec- 
ifications were  then  used  to  develop  three  independent  versions  of  LLP  in  “C”.  Each  study  version 
was  tested  against  a set  of  54  test  cases  from  a previous  experiment  involving  UP  and  was  also 
executed  for  1 million  test  cases  to  simulate  operational  testing. 

Goel  presented  a table  comparing  program  metrics  from  three  different  programmers  writing  “C" 
code  from  Z specifications  \ ersus  three  other  programmers  writing  Ada  code  from  informal  speci- 
fications. Metrics  included  the  number  of  source  lines  of  code  each  produced,  the  number  of  com- 
ment lines,  and  the  system  complexity  (a  combination  of  internal  and  external  complexity,  where 
the  external  complexity  measured  module  inter-relationships).  The  “C"  programmers  produced 
source  lines  of  code  ranging  from  373  to  669,  with  system  complexity  ratings  ranging  from  53  to  81, 
while  the  Ada  programmers  produced  source  lines  of  code  ranging  from  691  to  85L  with  system 
complexity  ratings  ranging  from  297  to  334. 

Productivity  numbers  were  examined,  and  consideration  was  given  for  the  time  that  some  of  the 
study  team  programmers  needed  to  learn  the  Z formalisms  and  to  write  the  Z specifications.  Goel's 
analysis  of  development  effort  figures  indicated  that  a considerable  amount  of  design  work  had 
actually  occurred  during  the  specification  phase,  somewhat  skewing  the  analysis.  The  number  of 
errors  (not  including  compilation  errors)  found  by  the  programmers  was  also  tracked.  Program- 
mers A and  C,  who  had  written  their  own  Z specifications,  found  all  errors  in  the  development  and 
unit  testing  phase,  while  programmer  B found  most  errors  in  the  later,  functional  testing  phase. 

Z specifications  were  judged  to  be  helpful  in  several  areas:  certain  ambiguities  were  resolved  by 
looking  at  the  problem  more  formally;  it  was  possible  to  express  some  invariant  properties  of  the 
system  more  dearly;  and  some  types  of  analytical  faults  were  avoided  as  a direct  result  of  using  very 
formal  spedfications.  The  use  of  Z specifications  also  exploited  the  repetitiveness  of  certain  launch 
conditions;  this  was  helpful  in  designating  functional  groupings  for  design  and  testing. 

Goel  drew  the  following  conclusions  from  this  study; 

• Use  of  Z spedfications  was  clearly  helpful  in  reducing  errors. 

• Based  on  a few  metrics,  it  appears  that  the  complexity  of  code  developed  from  Z spedfi- 
cations  was  lower. 

• The  total  effort  involved,  including  learning  Z formalisms  and  developing  the  formal 
spedfications,  was  comparable  to  that  for  developing  versions  from  informal  specifica- 
tions. 

However,  Goel  stressed  that  this  was  a very  small  experiment  and  did  not  provide  conclusive  evi- 
dence about  the  superiority  of  formal  specifications  over  informal  ones.  Further  work  is  necessary 
to  explore  the  feasibility  and  usefulness  of  Z for  large  problems  (scalability)  and  to  investigate  the 
reusability  of  such  formal  spedfications. 
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An  Analysis  of  Defect  Densities  Found  During  Software  Inspections 

John-KttlY 

Kelly  presented  results  of  a 3-year  effort  at  JPL  to  analyze  data  assessing  the  value  of  software  in- 
spections.  JPL  inspections  are  detailed  technical  reviews  performed  on  intermediate  engineering 
products  (in-phase  reviews);  they  are  highly  structured  and  well  defined,  carried  out  by  a small 
group  of  peers,  and  controlled  and  monitored  through  metrics  and  checklists.  The  203  software 
inspections  involved  in  the  study  represented  approximately  5500  hours  of  worktime  and  included 
examinations  of  requirements,  architectural  design,  detailed  design,  source  code,  test  plans,  and 
test  procedures. 

Inspection  elements  tailored  to  the  JPL  environment  included  participants  and  team  composition, 
training,  and  support  documentation.  In  addition,  a short,  optional  “third-hour"  phase  was  some- 
times used  during  the  inspection  meeting  to  clear  up  discrepancies  and  to  discuss  possible  fixes  for 
the  defects  found.  Approximately  five  team  members  were  used  for  an  inspection,  with  a total  staff 
time  of  around  28  hoars.  Input  to  the  process  generally  included  about  35  pages  of  documentation 
or  code,  and  the  output  from  the  process  was  about  35  pages  of  documentation  or  code  less  4 major 
defects  and  about  13  minor  defects  (where  a major  defect  was  defined  as  one  that  would  cause  the 
system  to  fail  or  to  miss  a requirement;  minor  defects  were  all  other  nontrivial  defects). 

One  major  metric  used  to  monitor  and  control  inspections  was  the  number  of  defects  found  per 
page.  Defects  found  per  page  versus  pages  per  inspection  tapered  off  as  more  pages  were  covered 
in  a fixed  2-hour  inspection.  It  was  determined  that  to  maintain  quality,  limiting  an  inspection  to  a 
maximum  of  40  pages  optimized  the  effort.  Another  major  metric  was  defect  density  versus 
inspection  type.  Analysis  showed  that  a significantly  higher  number  of  defects  existed  in  require- 
ments than  in  code,  and  a significantly  higher  number  of  defects  existed  in  high-level  test  plans  than 
in  low-level  test  plans.  This  indicated  that  the  m^jor  quality  problem  at  JPL  during  code  develop- 
ment is  the  writing  of  requirements.  Predictive  modeling  of  the  defect  density  versus  the  inspection 
type  showed  an  exponentially  decreasing  density  of  defects  as  the  code  level  was  approached. 

Kelly  reported  that  generally  about  1/2  hour  was  required  to  fix  a defect  found  early  in  the  life  cycle, 
compared  with  5 to  17  hours  to  fix  a defect  found  during  test.  Even  if  the  time  to  find  the  defect  is 
added  to  the  1/2  hour  time  to  fix,  it  is  much  cheaper  to  fix  the  defect  early  than  it  is  to  fix  it  during 
testing.  One  rationale  is  that  defects  multiply  themselves  as  the  phases  move  on,  clearing  a “tree"  of 
defects.  Inspection  allows  the  defect  to  be  found  at  its  root  node  and  to  be  fixed  there,  which  is 
much  less  expensive. 

Major  conclusions  of  this  study  were  the  following: 

• A wider  spectrum  of  errors  was  found  using  inspections  than  was  found  with  previous 
techniques;  the  most  prevalent  types  of  errors  were  in  the  areas  of  clarity,  logic,  com- 
pleteness, consistency,  and  functionality. 

• Increasing  the  number  of  pages  in  a single  inspection  decreases  the  number  of  defects 
found,  with  about  40  pages  being  optimal. 

• The  highest  defect  density  was  observed  during  requirements  inspections. 

• Larger  team  sizes  (6  to  8 people)  seem  to  be  justified  for  higher  level  inspections  (re- 
quirements inspections),  because  they  provide  a broader  viewpoint  and  seem  to  provide 
an  increased  defect  finding  capability. 
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• Code  inspections  were  superior  over  a single-person ’code  audit.  However,  future  re- 
search may  be  able  to  increase  the  return  on  investment  in  code  inspections  through  the 
use  of  automated  support 
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1990  SOFTWARE  ENGINEERING  WORKSHOP — PANELS 

TVvo  panels  were  added  this  year  to  the  Software  Engineering  Workshop.  The  first  panel  addressed 
how  to  establish  a successful  measurement  program.  The  second  panel  was  designed  to  stimulate 
discussion  by  highlighting  the  opinions  of  experts  in  the  field  of  software  engineering. 


PANEL  1- EXPERIENCES  IN  1MPLEH 


<G  AN  E 


MEASUREMENT  PROGRAM 


The  first  panel  of  the  workshop  involved  discussion  by  four  distinguished  panelists  who  have  done 
extensive  work  in  areas  of  software  engineering  such  as  measurement,  experimentation,  and  data 
collection:  Michael  DaskalantonakisofMotorola,Bob  Grady  of  Hewlitt- Packard,  Ray  Wolverton 
of  Hughes  Aircraft  Company,  and  Mitsuru  Ohba  of  IBM/Japan.  Frank  McGarry  of  the  Goddard 
Space  Flight  Center  moderated  the  panel.  Each  panelist  was  asked  to  relate  his  experiences  in 
defining  and  working  with  a successful  measurement  program  and  to  convey  his  experiences  to 
other  practitioners  who  may  want  to  implement  such  a facility  in  their  own  organizations.  Some  of 
the  considerations  of  interest  are  what  does  the  measurement  program  look  like,  what  obstacles 
had  to  be  overcome,  what  were  the  costs  and  benefits,  and  what  is  the  long-range  outlook. 


Michael  Daskalantonakis 
Motorola 


Overview  of  the  Motorola  Software  Metrics  Program 

Beginning  in  1988,  a company-wide  program  on  software  metrics  was  established  at  Motorola.  The 
primary  emphasis  of  this  program  has  been  establishing  an  organizational  infrastructure  to  sup- 
port the  use  of  software  measurement  technology  across  the  product  groups  and  to  ensure  that  they 
use  the  measurement  technology  effectively  to  obtain  the  maximum  benefits.  Three  important  ac- 
tivities have  been  key  in  Motorola’s  successfully  launching  its  metrics  program:  (1)  establishing  a 
Metrics  Working  Group  (MWG)  across  all  Motorola  business  units  to  design  a standard  set  of 
metrics;  (2)  creating  a Metrics  Users  Group  (MUG)  to  provide  a focus  for  implementing  software 
metrics  on  the  projects,  to  look  into  automation  (tools),  and  to  serve  as  a forum  for  exchanging 
ideas;  and  (3)  identifying  metrics  champions  within  the  corporate  research  and  development 
groups  3nd  within  business  units.  Important  2-day  training  workshops  were  held  to  promote  the 
effective  use  of  metrics  on  the  projects,  and  followup  consultation  was  also  provided.  A minimum 
set  of  software  metrics,  now  required  by  the  Motorola  Quality  Policy  for  Software  Development, 
was  defined  and  additional  metrics  were  used  by  projects  as  necessary. 

Daskalantonakis  summarized  the  overall  philosophy  of  the  measurement  program  at  Motorola: 
The  goal  is  not  measurement.  The  goal  is  improvement  through  measurement,  analysis,  and  feed- 
back. 


Obstacles  That  Had  To  Be  Overcome 

While  its  program  is  currently  considered  a success,  Daskalantonakis  discussed  two  obstacles 
Motorola  experienced  in  establishing  its  measurement  infrastructure. 

• A “system”  had  to  be  set  up  to  collect  the  software  metrics  data  and  to  analyze  the  data 
for  process  improvement  This  required  obtaining  tools  to  automate  metrics  collection 
and  analysis  activities. 
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• Software  developers  and  managers  had  to  be  convinced  and  reassured  that  metrics  data 
would  not  be  inconsistently  reported,  misused,  or  misinterpreted  and  that  the  collection 
effort  cost  would  not  unduly  burden  projects.  "D-aining  and  additional  guidelines  are 
being  used  to  address  these  culturally  related  issues. 

Costs  of  the  Metrics  Program 

There  have  been  costs  to  establish  the  program  as  well  as  operational  costs.  Program  costs  include 
man-hours  for  MWG  and  MUG  meetings  and  tool  development  costs.  Operational  costs  for  the 
metrics  program  from  sample  Motorola  divisions  have  been  1 percent  or  less  of  resources.  Post- 
release metrics  costs  have  been  insignificant  compared  with  benefits,  but  more  work  needs  to  be 
done  on  automation.  In  general,  Daskalantonakis  related,  the  overall  cost  is  considered  to  be  ac- 
ceptable and  justified. 

Benefits  to  Date 

The  benefits  realized  by  Motorola  have  clearly  justified  its  investment  in  a metrics  program: 

• Software  Quality  Awareness:  People  have  started  thinking  about  the  software  process, 
tbe  quality  of  that  process,  and  the  quality  of  the  resulting  product.  Metrics  data  have 
helped  the  understanding  of  several  problems,  have  demonstrated  the  severity  of  these 
problems,  and  have  spurred  action  toward  solutions. 

• Establishing  Baselines  and  Goals:  Metrics  have  helped  establish  baselines  to  identify 
current  progress  and  to  set  up  aggressive  goals,  leading  to  significant  quality  and  pro- 
ductivity improvements. 

It  is  not  the  metrics  themselves  that  have  made  the  difference,  but  the  actions  taken  as  a result  of 
looking  at  the  data;  the  benefits  stem  from  analyzing  the  data  and  feeding  back  information  to  im- 
prove the  process.  In  addition,  there  are  many  indirect  benefits,  such  as  improved  acceptance  crite- 
ria and  improved  schedule  estimation. 

Long-Range  Benefits  Expected 

•Daskalantonakis  concluded  with  a discussion  of  several  areas  expected  to  yield  long-term  benefits 
to  Motorola  as  a consequence  of  its  metrics  initiatives: 

• Learning  From  Mistakes:  Future  problems  can  be  avoided  by  looking  at  metrics  data 
from  previous  projects  and  learning  from  their  mistakes. 

• Improvement  in  Customer  Satisfaction:  Improved  product  quality  will  promote  im- 
proved customer  satisfaction. 

• Cost  Reduction:  Improved  quality  and  reduced  rework  cost  will  lead  to  significant  cost 
reductions.  In  addition,  resources  will  be  freed  up  for  new  software  development  work. 

• Cycle  Time  Reduction:  Productivity  improvement  is  expected  to  reduce  the  cycle  time, 
allowing  products  to  reach  the  market  in  a timely  manner. 

Again,  Daskalantonakis  emphasized  that  metrics  can  only  highlight  the  problems  and  suggest 
ideas  as  to  what  can  be  done.  It  is  the  action  taken  that  brings  the  benefits. 
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Bob  Gradv 
Hewlitt-Packard 


Overview  of  the  Hewlitt-Packard  Software  Metrics  Program 

Hewiitt-Packard  began  its  metrics  program  about  7 years  ago  as  part  of  an  overall  productivity  and 
quality  improvement  initiative.  The  starting  point  was  a standard,  although  primitive,  set  of  met- 
rics consisting  of  code  volume,  effort,  and  defect  definitions.  Over  the  past  7 years,  a supporting 
infrastructure  has  been  put  in  place,  with  activities  at  the  corporate  level  to  support  the  metrics 
effort  and  to  provide  divisional  process  assessments.  Most  of  the  productivity  and  quality  improve- 
ment programs  are  brought  together  at  the  product  group  level.  At  the  division  level,  metrics  efforts 
are  well  supported  through  the  quality  managers  and  productivity  managers;  the  quality  managers 
focus  on  the  quality  of  the  product,  whereas  the  productivity  manager*  focus  on  the  efficiency  of  the 
processes,  lb  support  all  of  these  measurement  efforts,  training  needs  have  been  addressed 
through  an  internal  2-day  metrics  class  that  has  evolved  over  time. 

At  the  corporate  level  two  major  software  goals,  identified  as  “1QX  improvements,”  were  estab- 
lished in  1986.  The  first  goal  was  to  improve  bye  factor  of  10  the  poctrdease  product  defect  density 
over  a 5-year  period.  The  second  goal  was  to  improve  by  a factor  of  10  the  number  of  open  serious 
and  critical  defects,  also  over  a 5-year  period.  This  second  goal  complements  the  first  one.  Both 
were  directly  tied  to  defects,  rather  than  cost,  assuming  that  there  is  an  indirect  relationship  to  cost. 
There  has  been  progress  toward  meeting  both  goals. 

Grady  described  the  Hewlitt-Packard  metrics  program  in  terms  of  a hierarchy.  He  identified  five 
different  stages: 

• Acceptance  of  measurement 

• Availability  of  project  trend  data 

• Use  of  common  terminology  necessary  for  data  comparisons 

• Experimenting  to  validate  the  best  practices 

• Performing  analysis  and  automated  data  collection  with  expert  system  support 

In  the  Hewiitt-Packard  experience,  the  first  three  stages  were  accomplished  relatively  quickly, 
within  the  first  2 years.  Now,  much  time  is  spent  with  experiments  validating  the  best  practices  and 
in  understanding  those  things  that  are  going  right  The  fifth  stage  will  take  more  time  to  achieve. 

Obstacles  That  Had  To  Be  Overcome 

Grady  discussed  four  obstacles  to  establishing  a metrics  program  that  were  more  cultural  than 
technical  in  nature: 

• Perceptions  of  Metrics:  The  basic  perception  had  to  be  overcome  that  metrics  were  fo- 
cused only  on  code  analysis.  In  the  industrial  area,  the  feeling  is  that  metrics  are  driven 
from  business  types  of  practices  in  terms  of  project  tracking.  However,  many  other  items 
such  as  effort,  quality,  and  productivity  projections  versus  actuals  are  also  necessary. 

• Promotion:  Three  separate  Hewlitt-Packard  groups  (top  management,  project  manag- 
ers. and  engineers)  had  to  be  convinced  of  the  benefits  of  metrics  and  had  to  be  shown 
bow  to  use  them  correctly. 


• Tbo-Rapid  Change:  Grady  cautioned  against  the  tendency  to  cry  to  change  things  too 
rapidly  in  what  is  essentially  a slowly  changing  process.  For  example,  the  Hewlitt- 
Packard  metrics  effort  found  the  Goal-Question-Metric  paradigm  useful  in  overcoming 
the  “leap  before  you  look”  syndrome.  Other  drivers  influencing  too-rapid  change  were 
the  “more  is  better”  and  “desperation  for  a breakthrough”  syndromes. 

• Organizational  Changes:  Organizational  changes  cause  problems  in  transitioning  new 
managers  into  an  already-established  metrics  program.  Getting  a baseline  in  place  can 
help  with  this  kind  of  problem. 

What  Wis  Done  Right 

Grady  next  shared  experiences  in  six  areas  that  Hewlitt-Packard  considers  successes  in  its  pro- 
gram: 

• Bottom-Up  Approach  With  Metrics  Council:  A hand-picked  metrics  council  was 
formed,  involving  approximately  20  of  the  more  experienced  midline  managers.  Their 
goal  was  to  identify  a set  of  measures  that  they  felt  would  be  useful  and  meaningful  in 
managing  projects.  This  approach  promoted  “ownership”  and  participation  in  the  pro- 
gram. 

• Started  Small:  The  effort  started  small,  using  the  set  of  three  primitive  metrics  de- 
scribed earlier.  It  is  felt  that  starting  small  and  building  on  that  foundation  was  the  cor- 
rect approach. 

• Creating  an  Environment  for  Reinforcing  Success:  This  involved  (1)  putting  in  place  a 
2-day  training  program  for  all  functional  managers  in  the  company,  with  the  primary 
focus  of  teaching  them  about  the  fundamentals  of  software  development  as  well  as  met- 
rics: (2)  establishing  internal  Software  Engineering  Productivity  Conferences,  with  em- 
phasis on  reporting  practical,  measurable  improvements  in  quality  or  productivity,  and 
(3)  identifying  productivity  managers. 

• Establishing  a Metrics  Gass. 

• Providing  Good  Tool  Support:  A good  set  of  minimal  but  adequate  tools  was  developed 
to  support  the  metrics  collection  effort. 

Costs  and  Benefits 

Grady  conveyed  that  talking  about  specific  costs  and  benefits  misses  the  real  issue:  there  is,  in  fact, 
no  choice.  Those  organizations  that  do  make  the  investment  in  a metrics  program  to  better  under- 
stand and  improve  their  development  process  will  have  a more  informed  basis  for  making  decisions 
and.  thus,  will  have  a competitive  advantage. 

He  cited  a specific  example  of  the  benefits  of  Hewlitt-Packard’s  applying  metrics  to  the  area  of 
failure  analysis.  The  causes  of  defects  were  tracked  and  categorized  into  defect  types  such  as  error 
checking,  logic  implementation,  user  interface,  integration,  software  testing,  standards,  and  data 
definition.  Also  identified  were  the  software  development  phases,  such  as  specifications/ 
requirements,  design,  code,  and  environmental  support  in  which  the  errors  were  occurring.  This 
provided  an  overview  of  the  major  types  of  defects  for  each  division,  allowing  improvement  efforts 
to  be  focused  in  those  areas  with  the  most  errors. 
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^ Future  benefits  are  expected  from  continued  use  of  failure  analysis,  as  well  as  from  continued  pur* 

suit  of  10X  improvement  goals.  Also,  the  long-term  trend  of  reduced  defects  will  have  a significant 
J impact  in  terms  of  shifting  software  effort  from  maintenance  to  new  development. 
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Rav  Walverton 

Hughes  Aircraft  Company.  ITT 

Wolvcrton  presented  an  overview  of  the  ITT  programming  measurement  effort  conducted  from 
1981  to  1986  at  its  Programming  Advanced  Technology  Center.  The  goal  was  to  invest  $100  million 
over  a 5-year  period  and  earn  the  company  a return  of  $1.1  billion  through  productivity  improve- 
ments. Wolverton's  particular  job  was  lo  prove  the  actual  cost  savings.  Programming  measurement 
involved  developing  yearly  baselines  for  progress  comparison,  a project  performance  reporting  sys- 
tem for  management,  forecasting  and  diagnostic  measurement  procedures  for  use  by  project  per- 
sonnel and  an  enhanced  integrated  measurement  system.  Over  a 3-year  period,  information  from 
106  projects  supported  data  collection,  cost  model  development,  and  cost  and  quality  trends  dem- 
onstrations. A lesson  from  this  experience  was  that  undertaking  such  an  ambitious  metrics  pro- 
gram requires  time. 

There  were  many  programming  measurement  objectives,  but  four  were  the  most  prominent: 

• Provide  an  early  warning  of  project  productivity,  quality,  schedule,  or  cost  difficulties 

• Improve  the  development  and  defense  of  competitive  bids 

• Compare  ITT  performance  with  overall  industry  performance 

• Improve  the  allocation  of  resources 

The  measurement  strategy  involved  dividing  responsibilities  between  the  “Programming’*  (metric 
study)  group  and  the  “Unit”  group  (individual  ITT  companies).  The  Programming  group  estab- 
lished and  enhanced  the  methodology  and  tools  and  instructed  the  Units  in  their  use.  The  Units 
established  their  responsibilities  and  collected  and  analyzed  the  data. 

Activities  needed  to  achieve  the  program’s  objectives  included  identifying  baselines  and  leading 
performance  indicators;  formulating  resource  estimating  techniques  and  quality  profiles:  promot- 
ing programmer/manager  development  in  the  metrics  area;  and  conducting  programmer  compe- 
tency and  task  analyses.  Wolverton  indicated  iTT's  priority  focused  on  identifying  leading 
indicators:  they  wanted  an  online  tool  that  would  allow  ITT  world  headquarters  to  determine  at  will 
the  performance  of  any  of  its  world-wide  units. 

The  overall  strategy  to  study  these  performance  factors  involved  the  use  of  an  online  network  meas- 
urement questionnaire.  The  effort  collected  basic  data  (defects,  resources,  and  costs)  and  environ- 
mental data  (requirements,  practices,  and  products)  and  organized  this  information  into  a data 
base.  Univariate  analysis  of  these  data  highlighted  13  major  factors  found  to  affect  programming 
productivity,  quality,  and  cost.  These  factors  included  modem  programming  practices,  program- 
ming personnel  the  organizational  structure,  available  tools,  project  complexity,  and  computer 
availability.  Multivariate  analysis  was  then  performed  by  product  category  to  produce  the  initial 
model 

The  research  identified  strongly  correlated  groups  of  productivity  factors.  Higher  productivity  was 
influenced  by  many  factors,  including  requirements  specifications  factors  (personnel  experience 
and  the  number  of  rewrites  necessary);  higher  usage  of  modem  programming  practices;  higher 
client  experience  and  participation;  lower  staffing  level  and  larger  target  and  development 
computers.  It  was  also  found  that  ail  factors  that  unproved  productivity  also  improved  quality.  In 
addition,  the  presence  of  one  productivity  factor  was  not  sufficient  to  assure  higher  productivity/ 
quality,  but  the  lack  of  that  factor  was  enough  to  guarantee  the  lack  of  high  productivity/low  error 
rates.  Productivity  trends  and  constraints  were  tracked  to  validate  the  results. 


Factors  were  categorized  as  controllable  (such  as  staffing  level  and  experience,  the  development 
computer,  requirements  specifications);  uncontrollable  (such  as  the  target  computer,  timing, 
memory  utilization,  and  the  application  complexity);  and  other  variables  (such  as  incorrect  data, 
newness  of  application/design,  and  documentation  requirements).  This  categorization  grouped 
information  in  terms  that  managers  could  understand.  The  13  productivity  factors  explained  2/3  of 
the  variation  in  productivity;  when  combined  with  developed  statements,  the  13  productivity  fac- 
tors explained  90  percent  of  the  variation  in  the  effort  Knowing  what  can  be  controlled  is  impor- 
tant* this  effort  determined  that  about  1/3  of  the  impact  on  productivity  can  be  controlled. 

One  key  finding  identified  by  Wolverton’s  effort  was  the  relationship  between  productivity  and 
quality,  with  quality  defined  as  “reduced  defects."  The  number  of  testing  defects/KLOC  were 
plotted  versus  a productivity  measure.  A quality  baseline  was  determined  to  be  20  errors/KLOC, 
based  on  the  cumulative  experience  of  the  study;  90  percent  of  the  projects  experienced  this  defect 
level  or  less.  TWo  projects  were  much  higher  in  the  number  of  defects;  in  both  of  these  projects, 
there  was  a great  deal  of  reuse.  Wolverton  cautioned  that  latent  errors  in  untested  reused  code  can 
cause  this  kind  of  result. 

The  Units  reacted  positively  to  the  introduction  of  the  metrics  program,  but  indicated  that  many 
factors  inhibited  its  effective  application  (insufficient  resources  and  no  time  to  collect  data).  How- 
ever, when  one  Unit  achieved  positive  results,  other  Units  became  interested  in  creating  the  same 
environment  Following  initial  negative  feelings  related  to  the  effort  and  time  required  for  acquir- 
ing data  and  testing  proposed  measures,  the  Units  gained  more  confidence  in  the  program  and 
chose  to  continue  an  R&D  phase  of  the  project 
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The  Japanese  generally  believe  that  “the  things  other  people  do  are  the  right  things  to  do.”  In  the 
software  context*  “other  people”  could  mean  other  organizations  within  a division*  other  divisions 
d within  a company,  other  companies  in  the  industry,  or  other  industries.  Ohba  explained  that  this 

belief  promotes  a great  deal  of  learning  from  others,  and  that  software  measurement  and  analysis 
activities  in  particular  have  benefitted  from  such  a perspective. 

Overview  of  Japanese  Measurement  Programs 

The  following  are  the  standard  measurements  used  in  Japanese  software  measurement  programs: 

• Size  (KLOC):  Noncommented  source  lines  of  code,  including  reused  source 

• Productivity:  LOC  per  programmer  month 

• Quality:  Errors  per  KLOC 

These  measures  are  conceptually  the  same  as  those  used  in  the  United  States  and  Europe. 
Obstacles  Overcome 

To  establish  a measurement  system,  there  had  to  be  agreement  on  what  should  be  measured  includ- 
ing 

• What  kind  of  data  should  be  collected 

• How  data  should  be  analyzed  (models,  techniques) 

• How  results  should  be  fed  back  into  the  management  and  development  processes 

The  answers  to  these  questions  vary,  because  no  standard  measures  exist.  This  has  been  a major 
obstacle  for  the  IBM/Japan  program.  Much  time  has  also  been  spent  on  determining  the  methods 
for  data  analysis.  Software  models  are  being  heavily  used  to  estimate  the  number  of  errors  remain- 
ing in  a program  and  to  estimate  the  required  maintenance  effort  However,  because  so  many  com- 
mercial models  are  available,  choosing  the  most  appropriate  models  to  be  used  requires  much 
discussion. 

Program  Cost 

Ohba  discussed  several  activities  a centralized  organization  must  perform  to  establish  an  effective 
measurement  program:  de&ne  measures  and  evaluation  systems,  define  the  data  to  be  collected, 
choose  simple  methods  for  analyzing  data,  develop  tools  for  collecting  and  analyzing  data,  maintain 
the  data  base,  and  provide  education.  In  Japan,  measurement  systems  are  defined  either  by  (1)  a 
central  software  technology  support  group  or  (2)  a quality  assurance  or  equivalent  organization. 
These  groups  define  the  measures  and  the  way  data  are  collected  based  on  de  facto  standards  or  on 
working  papers  from  various  committees.  Data  collection  and  analysis  are  done  by  project  groups, 
not  by  the  centralized  group.  With  all  of  these  required  activities,  an  effective  measurement  pro 
f gram  is  expensive. 
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Benefits  to  Date 


Ohba  stated  that  it  typically  takes  at  least  3 years  to  see  the  changes  or  benefits  from  a measurement 
program  but  by  implementing  a consistent  measurement  system,  IBM/Japan  has  achieved  the 
following  gains: 

• Management  by  quantitative  objectives:  setting  objectives  and  reviewing  achievements 
(e.g.,  software  reliability  growth  estimation) 

• Standard  and  consistent  control  of  the  software  process  by  defining  the  upper  and  lower 
control  limits 

• Incremental  and  continuous  process  improvement  by  setting  an  annual  goal  for  an  orga- 
nization 

As  a result  of  IBM/Japan’s  applying  such  software  measurement  techniques  and 
measurement-based  management  approaches  over  the  last  10  years,  their  defect  rate  has  been  re- 
duced from  5 errors  per  KLOC  to  0.1  error  per  KLOC. 

Long-Range  Benefits  Expected 

IBM/Japan’s  long-range  goal  is  the  optimization  of  its  software  process.  This  would  involve  the 
following: 


• Design  the  best  process  for  a project  based  on  past  experiences 

• Properly  monitor  and  manage  that  process  based  on  quantitative  data  analysis/ 
assessment 

• Reconfigure  the  process  dynamically,  if  needed,  based  on  the  data  analysis  results  and 
experiences 

Ohba  described  the  “software  factory”  as  an  overlapping  set  of  three  elements:  a process,  methods/ 
techniques  used  to  implement  the  process,  and  tools/measurements  to  support  the  methods/ 
techniques.  In  a specific  application  domain,  these  elements  and  their  region  of  intersection  can  be 
standardized.  This  ideal  “software  factory”  is  the  long-term  benefit  that  can  be  achieved  based  on 
this  measurement  system. 
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PANEL  2— SOFTWARE  ENGINEERING  IN  THE  1980S:  MOST  SIGNIFICANT 
ACCOMPLISHMENTS/GREATEST  DISAPPOINTMENTS 

The  second  pane!  of  the  workshop  involved  discussion  by  five  experts  in  the  field  of  software  engi- 
neering. The  first  four  panelists  were  Barry  Boehm  of  the  Defense  Advanced  Research  Projects 
Agency  Information  Science  and  Technology  Office  (DARPA ISTO),  Larry  Druffei  of  the  Software 
Engineering  Institute  (SEI).  Manny  Lehman  of  the  Imperial  Coilege,  and  Harlan  Mills  of  Software 
Engineering  Technology,  Inc.  (SET).  Vic  Basili  of  the  University  of  Maryland  provided  input  as  a 
fifth  panelist  and  also  moderated  the  panel.  Each  was  asked  to  assess  the  progress  of  software 
engineering  in  the  1980s  by  addressing  the  following  questions: 

• What  have  been  the  most  significant  achievements  for  software  engineering  in  the  past 
10  years? 

• What  have  been  the  greatest  disappointments  for  software  engineering  in  the  past 
10  years? 

• What  are  the  objective  or  subjective  criteria  supporting  your  assessments? 

• What  software  engineering  advances  will  make  the  most  significant  contribution  in  the 
next  5 years? 

The  panelists  found  amazing  similarities  in  their  views  of  achievements  and  disappointments  in 
software  engineering  in  the  1980s.  Top  achievements  agreed  upon  by  most  of  the  panelists  included 
the  study  of  process  to  improve  it,  the  evolution  toward  object-oriented  notions,  the  creation  and 
standardization  of  Ada,  the  establishment  and  importance  of  metrics,  the  evolution  of  formal 
methods,  the  creation  of  the  Software  Engineering  Institute,  the  improvement  of  life-cycle  models, 
and  the  availability  of  CASE  tools  in  industry.  However,  there  was  a dark  side  to  many  of  the 
agreed-upon  achievements.  The  disappointments  most  often  cited  were  in  Ada  use  and  Ada  edu- 
cation, CASE  tool  integration  standards  and  use,  and  the  lack  of  understanding  and  implementa- 
tion erf  software  engineering  as  a discipline  in  both  industry  and  education. 

Looking  forward,  the  panel  suggested  the  areas  of  most  progress  in  the  next  decade.  These  include 
the  maturation  of  object-oriented  techniques,  a focus  on  software  architecture  and  the  true  engi- 
neering of  software,  more  use  and  consistency  of  metrics,  and  process  improvement  through  meas- 
urement. 

In  the  lively  question  and  answer  session  that  followed,  the  panelists  conjectured  on  how  to  improve 
future  software.  Barry  Boehm  focused  on  people:  “Double  your  salary  structure,  and  get  rid  of  your 
unproductive  people,”  he  said,  to  the  delight  of  the  audience.  Harlan  Mills  felt  that  we  must  in- 
crease the  use  of  mathematical  and  formal  approaches;  he  said,  “We’re  not  doing  things  the  right 
way,”  but  also  agreed  with  Boehm  that  “We’re  not  using  the  right  kind  of  people.”  Larry  Druffei 
suggested  that  designers  must  build  changeability  into  the  original  design  to  lower  maintenance 
costs  (agreeing  with  Boehm  that  older  software  should  be  “obliterated”  rather  than  maintained). 
Manny  Lehman  further  supported  this  point  by  stating  that,  because  society  is  becoming  more  and 
more  dependent  on  software,  we  must  also  minimize  the  response  time  to  change;  he  said  that  fu- 
ture software  must  be  designed  so  that  “changes  can  be  made  in  real  time,”  in  response  to  the 
changing  world  that  software  supports.  Vic  Basili  offered  that  we  need  to  focus  on  process  improve- 
ment so  that  we  can  “predict  and  control  future  software  development”. 

Detailed  summaries  of  each  panelist’s  presentation  follow. 
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BARRY  BOEHM 


Barry  Boehm,  currently  director  of  the  Defense  Ad- 
vanced Research  Projects  Agency  Information 
Sciences  Technology  Office  (DARPA  ISTO),  was 
formerly  chief  scientist  at  TRW’s  Defense  Systems 
Group.  He  has  authored  several  books  and  won  nu- 
merous awards  in  the  areas  of  software  engineering 
and  software  measurement 


* 


V 


Boehm  summarized  his  list  of  the  primary  disappointments  and  achievements  in  software  engi- 
neering during  the  1980's  as  follows,  noting  that  there  are  correlations  between  them: 


MAJOR  ACHIEVEMENTS 

MAJOR  DISAPPOINTMENTS 

STARS 

STARS 

OBJECT-ORIENTED  METHODS 

METRICS  USAGE 

THE  SOFTWARE  ENGINEERING 

SOFTWARE  ENGINEERING 

INSTITUTE  (SB) 

EDUCATION 

PROJECTIONS  FOR  THE  FUTURE 

CONTINUED  PROGRESS  THROUGH  THE  SS 

MATURATION  OF  OBJECT-ORIENTED  TECHNIQUES 

STARS 

Boehm  stated  that  the  Software  Technology  for  Adaptable  and  Reliable  Systems  (STARS)  program 
had  a promising  start  in  1982,  but,  due  to  a variety  of  factors,  progress  in  the  mid-  to  late-1980s  was 
disappointing.  Now,  however,  Boehm  feels  that  STARS  is  achieving  progress  in  both  the  software 
engineering  process  and  product  areas.  In  1986,  the  structure  for  defining  the  products  sought  by 
STARS  was  revised.  These  products  would  now  be  software  support  mechanisms  focused  pri- 
marily on  tools  but  also  on  software  reuse  and  the  process  by  which  tools  should  be  applied.  STARS 
placed  particular  emphasis  on  choosing  really  well-qualified  people  to  perform  the  wort  Contrac- 
tors were  selected  based  on  demonstrated  ability  and  their  commitment  to  building  a commercially 
viable  software  support  environment.  Today,  companies  such  as  IBM,  Unisys,  Boeing  and  DEC 
are  well  on  the  way  to  producing  the  kinds  of  software  environments  that  support  reuse  and  provide 
good  process  models. 

THE  SOFTWARE  ENGINEERING  INSTITUTE 

Boehm  stated  that  great  progress  was  made  in  software  process  assessment  during  the  1980s.  At 
the  beginning  of  the  decade,  almost  nothing  was  being  done  in  this  area.  The  Air  Force  Aeronauti- 
cal Systems  Division  did  produce  some  good  checklists  for  capability  assessments,  and  a team  from 
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the  SEI  created  the  SEI  questionnaire  for  software  process  self-assessment.  The  thoroughness  of 
the  SETs  assessment  questions  and  the  fact  that  there  is  a bit  of  a carrot-stick  approach  involved 
prompted  people  to  take  software  process  concerns  seriously.  (Carrot  » “here  are  some  things 
thatwill  make  you  better";  stick  * “DOD  will  use  these  issues  from  time  to  time  as  source  selection 
criteria”).  Companies  that  have  taken  these  considerations  seriously  are  now  doing  a much  better 
job  of  producing  software. 

However,  Boehm  feels  the  achievement  in  process  assessment  is  not  yet  fully  consummated.  SEI 
process  maturity  Levels  L 2,  and  3 are  defined  quite  well,  but  the  definition  of  Level  5 is  not  very 
extensive.  Thought  is  needed  on  clarifying  the  ultimate  goal  in  terms  of  a software  organization’s 
process  maturity  expectations. 

One  of  the  biggest  disappointments  for  the  software  engineering  community  was  the  assessment 
results.  Eighty-five  percent  of  the  software  organizations  that  were  involved  in  self-assessment 
came  out  at  the  Level  L or  “chaotic”  stage,  the  lowest  level  of  the  scale.  Analysis  of  these  results 
showed  that  it  is  easy  to  publish  policies  and  standards  and  to  do  good  briefings,  but  it  is  difficult 
and  it  takas  a great  deal  of  commitment  for  a company  to  follow  through  on  these  things  and  really 
use  them  to  produce  better  software.  The  carrot-stick  philosophy  is  helping  to  improve  this  defi- 
ciency. 

METRICS  USAGE 

Boehm  went  on  to  explain  that  one  important  characteristic  of  the  top  levels  of  the  SEI  software 
process  maturity  scale  is  that  one’s  process  be  a measured,  optimized  process.  This  necessitates 
that  a well-defined  metrics  program  be  integral  to  the  software  organization's  manner  of  doing 
business.  One  of  the  biggest  disappointments,  however,  is  the  lack  of  such  metrics  programs 
throughout  the  industry.  In  Boehm’s  opinion,  people  attending  NASA/Goddard’s  Software  Engi- 
neering Laboratory  (SEL)  conferences  during  the  1980s  seem  to  be  the  only  exception  to  this.  Dur- 
ing the  period  of  1979- 1981,  considerable  excitement  and  interest  in  software  metrics  followed  the 
introduction  of  the  Constructive  COst  MOdel  (COCOMO).  but  since  then  relatively  few  organiza- 
tions have  adopted  really  comprehensive  metrics  collection  and  analysis  programs.  Reliability  is 
the  only  area  that  appears  to  have  received  attention.  In  general  little  progress  has  been  made  in 
convincing  organizations  that  it  is  in  their  best  interests  to  collect  this  information  and  analyze  it  as 
a"  prerequisite  to  improving  their  software  process. 


OBJECT-ORIENTED  METHODS 


According  to  Boehm,  there  was  little  in  1980  that  represented  a different  paradigm  around  which  a 
software  product  could  be  designed  and  organized.  However,  object-oriented  methods,  technolo- 
gies, and  program  languages  gained  prominence  and  began  substantially  to  alter  system  designers' 
and  developers’  thinking.  Potentially,  a significant  achievement  in  the  1990s  is  fully  developing  all 
of  these  object-oriented  technologies  into  an  integrated  support  environment. 


SOFTWARE  ENGINEERING  EDUCATION 

Boehm  indicated  that  another  huge  disappointment  has  been  software  engineering  education.  In 
1980,  a working  group  put  together  the  IEEE  model  curriculum  for  a master’s  degree  in  software 
engineering  that  resulted  in  a careful  well-rounded  program.  Unfortunately,  the  IEEE  did  not 
follow  through  on  it.  A curriculum  was  subsequently  created  at  the  Wang  Institute,  where  it  was  a 
strong  success  until  the  Institute’s  dissolution.  Fortunately,  the  seeds  of  that  work  are  reflected  in 
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the  SETs  curriculum  for  a master’s  degree  in  software  engineering.  However,  software  engineering 
education  at  the  undergraduate  level  is  still  abysmal— people  are  taught  programming,  not  soft* 
ware  engineering. 

A significant  achievement  is  that  the  SEI  has  picked  up  the  gauntlet  on  such  topics  as  education, 
process  assessment,  and  researching  real-time  Ada  systems  development  issues.  Because  of  this 
progress,  the  software  engineering  community  has  a far  better  understanding  of  and  access  to  these 
areas  than  was  possible  at  the  beginning  of  the  1980s.  However,  Boehm  felt  there  is  still  much  to  be 
done  and  that  the  SEI  will  be  a major  contributor  in  the  1990s. 
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LARRY  DRUFFEL 


Larry  Druffcl  is  the  Director  of  the  Software  Engi- 
neering Institute.  He  has  been  associated  with  Ada 
since  1978  and  was  the  first  Director  of  the  Ada  Joint 
Program  Office. 


Druffel’s  presentation  discussed  four  major  disappointments  and  six  major  achievements  in  soft- 
ware engineering  during  the  1980s: 


MAJOR  ACHIEVEMENTS 

MAJOR  DISAPPOINTMENTS 

ADA 

ADA 

FOCUS  ON  PROCESS 

SLOW  ACCEPTANCE  OF  SOFTWARE 
ENGINEERING  EDUCATION 

SOFTWARE  ENGINEERING  EDUCATION 

LACK  OF  TOOLS  INTEGRATION  AND 

SOFTWARE  ARCHITECTURE 

STANDARDIZATION 

SOFTWARE  ENGINEERING  ENVIRONMENTS 

CODE  REUSE 

OBJECT-ORIENTED  DESIGN  NOTION 

1 

PROJECTIONS  FOR  THE  FUTURE 

SOFTWARE  ARCHITECTURE  DESCRIPTION 

MATURATION  OF  OBJECT-ORIENTED  TECHNIQUES 

IMPROVED  DATA  COLLECTION 

Druffel  briefly  addressed  first  the  achievements  and  then  the  disappointments: 

• Ada:  Druffel  stressed  that  the  adoption  of  Ada  as  an  ANSI  and  ISO  standard  must  be 
considered  one  of  the  major  achievements  of  software  engineering  in  the  1980s.  Benefits 
of  the  Ada  standard  are  the  tremendous  strides  seen  in  reuse  and  transportability  and 
the  enormous  amount  of  private  investment  that  has  been  made  in  the  development  of 
tools,  the  maturation  of  compilers,  and  the  support  of  optimization  techniques. 

e Focus  on  Process:  The  focus  on  the  software  engineering  process  is  an  outstanding 
achievement  He  cited  the  Space  Shuttle  work  as  a particularly  good  example  of  the  pro- 
gram improvements  achievable  through  such  a focus.  Druffel  stressed  that  it  is  impor- 
tant that  this  focus  on  process  be  supported  by  measurement  and  education.  The  most 
successful  efforts  in  improving  process  have  had  at  least  ad  hoc  measures  to  support 
them. 

• Software  Engineering  Education:  The  principal  achievement  in  the  area  c*  software  en- 
gineering education  has  been  the  emergence  of  the  idea  that  there  is  now  enough 
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codified  information  that  software  engineering  can  be  taught  The  notion  has  become 
accepted  and  there  are  people  willing  to  teach  software  engineering. 

• Software  Architecture:  Perhaps  one  of  the  greatest  achievements  is  that  there  is  now  a 
realization  that  software  architecture  needs  attention— real  analysis  on  the  structure  of 
a system. 

• Software  Engineering  Environments:  There  has  been  marked  progress  in  the  past  dec- 
ade in  recognizing  the  need  for  improved  software  engineering  environments.  In  the 
past,  everything  was  ad  hoc,  using  a simple  set  of  tods,  with  little  or  no  appreciation  for 
integrating  these  tools  into  a cohesive  environment 

• Object-Oriented  Design  Notion:  Druffel  stated  that  we  have  seen  the  emergence  of  a 
whole  new  notion  of  managing  objects  and  designing  systems  by  objects,  even  though  the 
notion  of  an  object  is  not  yet  clearly  understood  or  universally  defined.  The  appearance 
of  these  object-oriented  techniques  and  the  way  they  are  changing  our  approach  to  sys- 
tems development  are  very  important  advances. 

Druffel’s  list  of  major  disappointments  included  the  following: 

• Ada:  The  Ada  program  has  also  yielded  a major  disappointment  as  well  as  major 
achievements  during  the  1980s.  In  Druffel’s  opinion,  the  major  disappointment  has 
been  the  failure  of  the  software  engineering  community  (including  DOD,  industry,  and 
particularly  the  academic  world)  to  take  advantage  of  Ada  as  fully  and  as  rapidly  as 
could  have  been  done. 

• Lack  of  Acceptance  of  Software  Engineering  in  the  Academic  Community:  The  disap- 
pointment in  software  engineering  education  is  that  the  academic  community  at  large 
doesn’t  seem  to  want  to  accept  the  idea  of  engineering  software.  Druffel  stated  that  this 
limited  perspective  is  holding  the  industry  back. 

• Lack  of  Standardization  in  the  Integration  of  Computer-Aided  Software  Engineering 
Tools:  Great  progress  has  been  made  in  the  development  of  computed-aided  software 
engineering  tools.  However,  it  is  a disappointment  that  there  has  been  a lack  of  any 
standardization  to  enable  effective  use  on  a project  without  a lot  of  manual  intervention. 

• Code  Reuse:  As  in  the  decade  of  the  70s,  code  reuse  remains  a major  area  of  disappoint- 
ment Some  progress  is  visible  in  narrow  application  domains,  but  wider  reuse  will  not 
be  realized  until  domain  analysis  and  designing  for  reuse  are  given  more  emphasis. 

SOFTWARE  ENGINEERING  ADVANCES  IN  THE  NEXT  5 YEARS 

Druffel  discussed  several  projected  software  engineering  advances  for  the  coming  decade: 

• Software  Architecture:  Druffel  postulated  that  software  architecture  description  may 
be  one  of  the  major  achievements  in  the  next  5 years.  There  will  be  the  capability  to 
describe  software  architecture,  agreement  on  the  language  and  symbolic  representation, 
and  the  development  of  analytical  techniques  to  determine  for  which  applications  an 
architecture  is  appropriate  and  for  which  it  is  not 

• Object-Oriented  Design:  There  should  be  a tremendous  maturation  of  object-oriented 
design  and  related  object-oriented  techniques,  allowing  for  their  effective  use. 


Data  Collection:  The  next  S years  should  see  data  collection  follow  a more  structured 
approach  and  the  development  of  consistent  definitions  to  allow  reasonable  analysis 
across  projects.  This  will  be  possible  only  because  of  the  work  done  through  NASA/ 
Goddard’s  series  of  Software  Engineering  Laboratory  workshops  and  the  contributions 
of  those  at  the  University  of  Maryland  and  the  SEL  to  this  effort. 
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MANNY  LEHMAN 


Manny  Lehman  is  a professor  of  Computer  Science 
at  the  Imperial  College  in  England  He  has  done  con- 
siderable work  in  software  measurement  and  has  au- 
thored over  100  technical  papers. 


Lehman  presented  his  initial  list  of  the  most  significant  achievements  and  disappointments  in  soft- 
ware engineering  during  the  1980s,  placing  Computer  Aided  Software  Engineering  (CASE)  tools  in 
both  categories: 


MAJOR  ACHIEVEMENTS 

MAJOR  DISAPPOINTMENTS 

CASE  TOOLS 

CASE  TOOLS 

RECOGNITION  AND  ACCEPTANCE  OF  THE 

LACK  OF  CASE  PENETRATION  INTO 

EVOLUTIONARY  NATURE  OF  SOFTWARE 

THE  INDUSTRY 

FOCUS  ON  PROCESS  AND  PROCESS 

INADEQUATE  CASE  SUPPORT 

MODELS 

ENVIRONMENTS 

MORE  DISCIPLINED  SOFTWARE 

INADEQUATE  UNDERSTANDING  OF 

ENGINEERING  PROCESS 

SOFTWARE  ENGINEERING 

PROJECTIONS  FOR  THE  FUTURE 

IMPROVED,  INTEGRATED  CASE  ENVIRONMENTS 

IMPROVED  COST  EFFECTIVENESS  THROUGH  CASE 

Lehman’s  discussion  of  the  most  significant  achievements  included  four  items,  all  interrelated  in  a 
logical  progression: 

• The  software  engineering  community  is  demonstrating  an  increasing  appreciation  of  the 
intrinsically  evolutionary  nature  of  software. 

• Clear  evidence  exists  that  the  community  has  recognized  the  significance  of  the  software 
development  process  and  of  process  models. 

• Wider  appreciation  and  acceptance  have  been  evident  in  the  academic  and  research 
communities  (and  are  now  emerging  in  industry)  of  the  importance  of  discipline, 
method,  formality  and  mechanization  in  the  software  process. 

• The  community  has  recognized  the  need  for  significant  CASE  tools  and  the  integrated 
support  environments  to  promote  their  effective  use. 

According  to  Lehman,  although  there  may  be  a very  solid  understanding  of  a process,  turning  the 
process  into  reality  and  executing  it  with  precision  require  discipline,  method,  and  formality.  These 
requirements  clearly  indicate  the  need  for  automated  support,  and  while  many  tools  have  been 
emerging,  industry  has  not  yet  provided  the  integrated  support  environments  necessary  to  exploit 
their  power. 
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CASE  development  has  primarily  arisen  from  the  search  for  productivity  growth  in  software  devel- 
opment Consequently,  Lehman  sees  the  largest  disappointment  to  be  the  failure  of  CASE  to  de- 
liver this  evident  productivity  growth.  Precisely  because  CASE  tools  have  not  shown  the  desired 
productivity  benefits,  their  use  hasn’t  penetrated  widely  or  deeply  into  the  industrial  or  commercial 
software  development  arenas. 

Another  disappointment  has  been  the  very  slow  development  of  satisfactory,  comprehensive, 
transferable,  and  usable  support  environments.  Lehman  believes,  however,  that  the  reasons  for 
this  are  now  sufficiently  well  understood  He  projected  that  over  the  next  decade  there  will  be  ad- 
vances in  this  area,  and  that  much  wider  penetration  of  CASE  tools  will  then  follow. 

A final  major  disappointment  has  been  the  failure  to  achieve  wide  industry  appreciation  of  the  true 
meaning  of  software  engineering  and  the  role  of  software  engineers.  The  community  at  large  seems 
to  use  the  terms  ’software  engineer*  and  ‘programmer*  interchangeably.  Lehman  stressed  that 
these  terms  are  not  the  same;  they  are  two  quite  different  yet  complementary  roles.  Programming 
and  the  programmer  are  product  oriented.  The  software  engineer  is  primarily  a process  engineer, 
focusing  on  designing  the  processes  the  programmers  use,  on  the  methods  to  be  followed,  on  the 
tools  that  can  be  applied  and  on  the  organizational  aspects  of  software  development  projects. 

Lehman  concluded  with  his  prediction  that  CASE  progress  in  the  coming  years  will  be  significant, 
but  only  if  people  understand  what  CASE  really  is,  how  it  can  be  transferred  to  industry,  how  it  can 
be  applied  effectively,  and  how  and  when  benefits  can  be  assessed.  Central  to  achieving  this  prog- 
ress is  the  need  for  industry  to  develop  the  proper  types  of  CASE  environments  through  the  integra- 
tion of  existing  tools,  creating  comprehensive  software  development  support  environments. 

In  summary,  Lehman  emphasized  that  the  primary  goal  of  CASE  cannot  be  immediate  productiv- 
ity growth,  cost  reduction,  or  visible  improvements  in  product  quality.  Ironically,  early  CASE  use 
will  lead  to  increased  costs,  because  training,  tool  and  workstation  acquisition,  and  production  mo- 
mentum loss  all  have  an  initial  cost.  What  needs  to  be  considered  is  what  will  happen  in  the  long 
term:  improved  cost-effectiveness  over  time  associated  with  higher  quality,  more  reliable,  and 
more  adaptable  software  is  the  ultimate  benefit.  Realizing  this  benefit  depends  on  using  CASE 
long  enough  to  aggregate  small,  invisible  benefits  into  larger,  visible  benefits.  The  goal  of  CASE  is 
institutionalizing  a process  that  achieves  and  maintains  user  satisfaction  with  the  software  product, 
ultimately  leading  to  productivity  growth  and  revenue  growth. 

Despite  its  great  potential,  however,  Lehman  cautioned  that  obstacles  to  CASE  progress  include  a 
long  lead  time  before  benefits  will  be  realized;  the  need  for  major  financial  investments  by  the  user; 
difficult  cost-benefit  analyses  because  of  the  many  imponderables  related  to  CASE;  and  a delay  in 
truly  quantifying  the  benefits  of  CASE  until  the  industry  gains  significantly  more  experience  in  its 
use. 
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HARLAN  MILLS 


Harlan  Mills  is  currently  the  president  and  chief 
technical  officer  of  Software  Engineering  Tech- 
nology, Inc.  (SET).  He  has  had  a distinguished  tech- 
nical career,  has  authored  over  50  refereed  papers, 
and  has  won  numerous  software-related  awards. 


Several  of  the  advances  identified  by  Mills  addressed  process  and  methodology  topics,  while  the 
disappointments  cited  deficiencies  often  characteristic  of  delivered  software  products: 


MAJOR  ACHIEVEMENTS 

MAJOR  DISAPPOINTMENTS 

SPIRAL  DEVELOPMENT  MODEL 

LACK  OF  ENGINEERING  0 ISO  PUNE 
FOR  SOFTWARE  DEVELOPMENT 

METRICS 

POOR-QUALITY  SOFTWARE 

SOFTWARE  ENGINESTING  INSTITUTE  (SS) 

LOW  PRODUCTIVITY 

CLEANROOM  DEVELOPMENT 
METHODOLOGY 

MISSED  SCHEDULES 

PROJECTIONS  FOR  THE  FUTURE 

FORMALIZATION  OF  SPIRAL  MODELS 

SB  GROWTH 

IMPROVED  METRICS 

EXPANDED  USE  OF  CLEANROOM  METHODOLOGY 

Mills  discussed  four  areas  he  felt  have  represented  significant  achievements  in  software  engineer- 
ing over  the  past  decade: 

• Spiral  Model:  The  spiral  model  of  software  development,  articulated  best  by  Barry 
Boehm,  may  replace  the  traditional  waterfall  model 

• Metrics:  Significant  developments  in  metrics  for  software  technical  management  have 
emerged  from  Barry  Boehm,  Vic  Basili,  and  other  people  in  the  field. 

• SEI:  A national  resource  Las  been  established  in  the  Software  Engineering  Institute. 

• Geanroom  Methodology:  Encouraging  progress  has  been  seen  in  applications  of  the 
Geanroom  approach  to  the  engineering  of  software  under  statistical  quality  control 
emphasizing  the  quality  of  people. 

Mills’  list  of  the  greatest  disappointments  over  the  past  decade  stems  from  the  following  observa- 
tion: 

“Software  engineering  is  used  as  a buzzword:  Software  engineering  isn't  being 
treated  as  a real  engineering  discipline." 
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Mills  feels  that  the  people  called  "software  engineers”  today  are  typically  programmers,  and  that 
they  should  be  upgraded  in  some  way  through  better  training  and  the  introduction  of  more  disci- 
pline and  mathematical  formalism  into  the  software  development  process.  Mills  continued  that 
this  perspective  has  led  to  the  continued,  widespread,  and  unnecessary  existence  of  three  condi- 
tions: 

• Poor  quality  unreliable  software 

• Low  productivity  in  software  development  It  has  been  demonstrated  that  improvement 
in  productivity  by  factors  of  5 and  10  can  be  achieved  when  people  "learn  to  do  it  right" 
as  professionals. 

• Missed  schedule  of  software  deliveries:  Getting  things  done  on  time  is  rarely  achieved  in 
the  industry. 

To  further  support  his  position.  Mills  discussed  the  properties  of  Geanroom  engineering  and  why 
this  method  is  worthwhile: 

• Statistical  Usage  Specifications  (As  Weil  as  Performance  and  Functional  Specifica- 
tions): Tb  talk  about  the  reliability  of  software.  Mills  feels  we  need  to  know  the  ways  in 
which  that  software  is  to  be  used.  This  knowledge  will  better  guide  us  to  test  particular 
portions  of  the  software  more  rigorously  than  other  portions.  We’re  testing  in  the  wrong 
kinds  of  ways  in  the  development  shops  and  are  waiting  to  see  how  the  customer  actually 
uses  the  software  to  determine  where  testing  is  really  needed. 

• Software  Development  in  a Pipeline  of  Increments  with  Separate  Certification:  Soft- 
ware engineers  are  learning  to  avoid  the  need  to  debug  programs.  Cleanroom  involves  a 
pipeline  of  increments  (such  as  10,000  lines  of  code)  that  can  be  written  without  debug- 
ging and  that  are  very  dose  to  error  free. 

• Scaled-up  Informal  Verification  of  Software  to  Meet  Specifications:  This  involves  using 
mathematical  ideas,  such  as  scaling  up  using  axiomatic  verification,  etc.,  on  100.000  or 
300,000  line  programs. 

• Producing  Software  without  Private  Debugging  Before  Public  Certification  Testing: 
The  people  who  certify  the  software  do  the  testing. 

Mills  identified  four  areas  of  potential,  significant  achievement  in  software  engineering  over  the 
next  decade: 

• Formalization  of  Spiral  Models  of  Software  Development  for  Procurement/ 
Management:  Software  engineering  has  to  have  a mathematical  foundation,  and  this  is 
one  good  area  with  which  to  begin. 

• Metrics:  Continued  development  of  metrics  for  software  technical  management 

• Continued  Growth  of  the  SEI:  This  involves  extending  its  influence  over  software  engi- 
neering. 

• Expanded  Use  of  Software  Engineering  Under  Cleanroom  Engineering  (or  Something 
Similar),  with  Statistical  Quality  Control:  This  requires  using  real  engineering  and  not 
cut-and-dried  programming. 
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Vic  Basili,  professor  of  Computer  Science  at  the  Uni- 
versity of  Maryland,  has  authored  over  100  papers  in 
software  engineering,  metrics,  and  methods,  and  is 
editor-in-chief  of  Transactions  on  Software  Engi- 
neering. 


As  the  facilitator  for  this  panel  and  one  of  the  principals  in  15  years  of  SEL  research,  Vic  Basili 
presented  *he  SEL  viewpoint  on  the  most  significant  achievements  and  greatest  disappointments  in 
software  engineering  over  the  past  10  years: 


MAJOR  ACHIEVEMENTS 

MAJOR  DISAPPOINTMENTS 

IMPORTANCE  OF  PROCESS  AND 
FORMAL  METHODS 

MATURING  TAKING  TOO  LONG 

INADEQUATE  UNDERSTANDING  OF 

COMMUNITY  RECOGNITION  OF  NEED 

TECHNOLOGY  BUILDING* 

FOR  MULTIPLE  LIFE-CYCLE  MOOELS 

MEASUREMENT  NOT  SUFFICIENTLY 

MEASUREMENT  AND  METRICS 

WIDESPREAD 

OBJECT-ORIENTED  METHODS 

LIMITED  AUTOMATED  SOFTWARE 
DEVELOPMENT  SUPPORT 

ADA 

FEW  ADVANCES  If ' TESTING  TECHNIQUES 

PROJECTIONS  FOR  THE  FUTURE 


FOCUS  ON  ENGINEERING  SOFTWARE 
WIDER  USE  OF  MEASUREMENT 
REUSING  PACKAGED  EXPERIENCE 
MATURING  OF  PERSONNEL  SKILLS 
INCREASED  AUTOMATED  SUPPORT 

Some  of  the  earlier  panelists  had  classified  a given  item  as  both  an  achievement  and  a disappoint- 
ment Basili  noted  that  this  was  often  because  the  item  was  an  “achievement,"  but  it  hadn’t  moved 
fast  enough  or  been  adequately  assimilated  into  the  community  and  so  was  also  considered  a disap- 
pointment Basili  discussed  the  achievements  in  two  categories: 

• Maturing 

Process  and  Methods:  Increased  recognition  of  the  importance  of  process  and  for- 
mal methods  has  been  a major  achievement  over  the  past  10  years.  The  SEI  proc- 
ess assessment  has  raised  the  level  of  concern  about  process,  and  formal  methods  > 

such  as  verification  and  Cleanroom  have  shown  movement  in  the  right  direction. 
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Multiple  Life-Cycle  Models:  Recognition  of  the  need  for  multiple  life-cyck  inode's 
and  methods  has  been  another  major  area  of  progress.  There  is  no  longer  only  a 
waterfall  model;  a spiral  model  and  prototyping  models  have  also  been  introduced. 
There  are  different  ways  of  doing  things.  There  is  no  longer  a one-model  mentality 
in  the  community. 

• Technologies 

Measuranent/Metrics:  Measurement  and  metrics  techniques  have  matured,  and 
the  community  now  better  understands  how  to  apply  them.  The  use  of  measures/ 
metrics  may  not  be  widespread  enough,  but  from  an  SEL  perspective  the  technol- 
ogy exists  to  do  the  kinds  of  things  needed  to  reach  Level  5 of  the  SEI  software 
development  maturity  matrix. 

Otjfect-OricnAed  Techniques:  The  use  of  data  abstractions  and  object-oriented 
methods  has  significantly  altered  the  way  current  systems  are  being  designed  and 
implemented. 

Ada:  Ada’s  benefits  are  as  much  related  to  its  promoting  wider  use  of  improved 
software  engineering  techniques  as  they  are  related  to  the  language-specific  fea- 
tures of  Ada. 

Basili  discussed  several  disappointments,  many  of  them  parallel  to  items  included  in  his  achieve- 
ments list: 


• Maturing  has  taken  so  long. 

• Some  people  are  still  looking  for  “magic"— a “silver  bullet"  that  will  provide  a quick  cure 
for  our  software  development  problems.  What  we  must  realize  is  that  process  improve- 
ments require  technology  building,  maturing,  evolution,  and  evaluation. 

• Although  some  organizations  have  implemented  measurement  programs,  a general  I ack 
of  measurement  and  formal  methods  pervades  the  industry. 

• Too  little  effective  automated  support  for  software  development  exists.  CASE  is  an  ex- 
ample of  a disappointment  in  this  category.  CASE  is  a bottom-up  issue,  where  pieces  of 
technology  have  been  dealt  with  but  CASE  hasn’t  solved  what  people  are  actually  trying 
to  do. 

• While  much  attention  has  been  given  to  the  design  and  implementation  areas  of  soft- 
ware development,  few  advances  in  testing  practices  have  been  made.  There  are  many 
test  techniques  in  the  literature,  but  most  are  not  available  in  practice.  This  is  partly 
because  the  tests  exist  only  at  an  academic  level  rather  than  as  pragmatic  ways  of  doing 
testing  in  the  real-world. 

Basili  categorized  future  achievements  as  those  to  be  attained  over  the  next  5 years  and  those  to  be 
attained  over  the  next  10  years. 

• Next  5 Years 

Focus  on  engineering  software:  This  is  the  issue  of  discipline  and  evolutionary 
process. 
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Measurement:  Wider  evidence  of  process  improvement  through  measurement  is 
expected. 

Reuse  of  packaged  experience:  This  includes  code,  as  long  as  it  is  done  in  the  right 
context,  i.e.,  as  long  as  reuse  is  done  for  a product  unit  in  the  context  of  architecture 
and  in  the  context  of  processes  that  produce  the  appropriate  kinds  of  useful  ob- 
jects. 

• Next  10  Years 

Real  automated  support:  This  will  require  the  natural  evolution  of  supporting 
processes. 

Maturing  of  personnel:  The  industry  needs  staff  with  consistent  backgrounds. 
Over  a long  period  of  time,  people  who  speak  the  same  language  and  haw  similar 
kinds  of  training  will  become  more  prevalent  in  the  software  engineering  commu- 
nity. 
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SOFTWARE  ENGINEERING  LABORATORY  (SEL) 

BACKGROUND 

• EARLY  1970’s 

- INCREASING  REALIZATION  OF  SOFTWARE  ROLE  (COST,  IMPORTANCE,...) 

- NUMEROUS  SOFTWARE  TECHNOLOGIES  EVOLVING/MATURING 

• TOOLS  (DEBUG  AIDS,  CODE  ANALYZERS,...) 

• STRUCTURED  DESIGN/ANALYSIS/IMPLEMENTATION... 

• TESTING/VERIFICATION  TECHNIQUES 

• MANAGEMENT  METHODS  AND  AIDS 

- NO  GUIDANCE  FOR  SELECTING/APPLYING  EVOLVING  SOFTWARE  TECHNIQUES 

• LIMITED  EMPIRICAL  EVIDENCE 

• OVER  DEPENDENCE  ON  MOST  VOCAL  ADVOCATES 

• 1975  - 76’ 

- PARTNERSHIP  BETWEEN  NASA/GSFC  AND  UNIVERSITY  OF  MARYLAND  FORMED 

• EXPERIMENT  WITH  AVAILABLE  TECHNIQUES  IN  PRODUCTION  ENVIRONMENT  (NASA/GSFC) 

• DETERMINE  WHICH  TECHNIQUES  ARE  EFFECTIVE  (MEASUREMENT) 

• INFUSE  IDENTIFIED  TECHNOLOGY  BACK  INTO  PROCESS  (PROCESS  IMPROVEMENT) 

- CSC  - AS  PRIMARY  FLIGHT  DYNAMICS  SOFTWARE  CONTRACTOR  BECOMES  3rd  PARTNER 

*?.  • 1976-90’ 

1 1 - CONSISTENT  PARTNERSHIP  IN  SEL  (NASA/UM/CSC) 

8J  - EVOLUTION  TOWARD  UNDERSTANDING  SOFTWARE  AND  OPTIMIZING  PROCESS 
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SOFTWARE  ENGINEERING  LABORATORY 
ESTABLISHED  1975  - 76’ 


• GOALS 

- UNDERSTAND  THE  SOFTWARE  PROCESS  IN  A PRODUCTION  ENVIRONMENT 

- DETERMINE  IMPACT  OF  AVAILABLE  TECHNOLOGIES 

- INFUSE  IDENTIFIED/REFINED  METHODS  BACK  INTO  DEVELOPMENT  PROCESS 

• APPROACH 

- IDENTIFY  TECHNOLOGIES  WITH  HIGH  POTENTIAL 

- APPLY  AND  EXTRACT  DETAILED  DATA  IN  PRODUCTION  ENVIRONMENT  (EXPERIMENTS) 

- MEASURE  IMPACT  (COST,  RELIABILITY,  QUALITY,...) 

• SIGNIFICANT  CONSIDERATIONS 

- CANDIDATE  PROJECTS  (IMPACTS/COST/...) 

- DATA/INFORMATION  (DEFINING/PROCESSING A/ ALIDITY/...) 

- EXPERIMENTAL  DESIGN  (CLASSES  OF  STUDIES/DOMAIN  ANALYSIS...) 
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PRODUCTION  ENVIRONMENT 


SOFTWARE  CHARACTERISTICS 

• SCIENTIFIC  (FLIGHT  DYNAMICS) 

• GROUND  BASED  (NON-EMBEDDED) 

• INTERACTIVE 

LANGUAGES 

• 75%  FORTRAN 

• 15%  Ada 

• 10%  OTHER  (C,  PASCAL,  LISP,  ...) 

PROJECT  CHARACTERISTICS 

TYPICAL 


• DURATION  (MONTHS)  24-40 

• EFFORT  (STAFF  YEARS)  30-45 

• SIZE  (KSLOC)  100-300 

• STAFF  (FTE)  5-15 


‘HOMOGENEOUS  CLASS 
OF  SOFTWARE 

‘CONSISTENT  SUPPORT 
ENVIRONMENT 

‘CONTROLLED  PROCESS 


‘SEVERAL  PROJECT  EXCEED  200  STAFF  YEARS 
‘SEVERAL  LESS  THAN  3 STAFF  YEARS 


SEL 

EVOLVING  “PROCESS  IMPROVEMENT”  ENVIRONMENT 

WHAT  HAS  BEEN  LEARNED? 


1.  HOW  HAS  THE  MODEL  OF  PROCESS  IMPROVEMENT  EVOLVED? 

- PHASES  TOWARD  PROCESS  IMPROVEMENT 

- WHAT  CLASSES  OF  EXPERIMENTS/STUDIES  ARE  NEEDED 

- DO  WE  UNDERSTAND  THE  IMPROVEMENT  PARADIGM  BETTER  TODAY 

2.  WHAT  IMPACT  HAS  THE  EXPERIMENTATION  HAD  ON  A PRODUCTION  ENVIRONMENT? 

- IS  SOFTWARE  DEVELOPED  DIFFERENTLY  NOW  (BETTER?) 

- HAS  THERE  BEEN  A CHANGE  IN  ATTITUDE  OF  DEVELOPERS 

- WHAT  CHANGES  HAVE  BEEN  MADE  AS  A RESULT  OF  'STUDIES' 

3.  WHAT  HAS  BEEN  LEARNED  FROM  THE  STUDIES? 

- ARE  THERE  TECHNIQUES  THAT  HELP  (WHICH  ONES) 

- WHAT  ARE  KEY  ASPECTS  OF  PROCESS  IMPROVEMENT 

- WHAT  ARE  MAJOR  IMPACTS/LESSONS  FOR  NASA 
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Towards  a Mature 
Measurement  Environment: 

Creating  a Software  Engineering  Research 

Victor  R.  Basili 

Institute  for  Advanced  Computer  Studies  . n v 

Department  of  Computer  Science  M ■' 

University  of  Maryland 
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Software  Engineering  Research 

Software  engineering  researchers  are  building  tools,  defining  methods  and  models. 
However,  there  are  problems  with  the  nature  and  style  of  the  research.  The  research  is 
typically  bottom-up,  done  in  isolation  so  the  pieces  cannot  be  easily  logically  or 
physically  integrated.  A great  deal  of  the  research  is  essentially  the  packaging  of  a 
particular  piece  of  technology  with  little  indication  of  how  the  work  would  be  integrated 
with  other  pieces  of  research.  The  research  is  not  aimed  at  solving  the  real  problems 
of  software  engineering,  i.e.,  the  development  and  maintenance  of  quality  systems  in 
a productive  manner.  The  research  results  are  not  evaluated  or  analyzed  via 
experimentation  or  refined  and  tailored  to  the  application  environment.  Thus,  it  cannot 
be  easily  transferred  into  practice.  Because  of  these  limitations  we  have  not  been  able 
lo  understand  the  components  of  the  discipline  as  a coherent  whole  and  the 
relationships  between  various  models  of  the  process  and  product. 

What  is  needed  is  a top  down  experimental,  evolutionary  framework  in  which  research 
can  be  focused,  logically  and  physically  integrated  to  produce  quality  software 
productively,  and  evaluated  and  tailored  to  the  application  environment.  This  implies 
the  need  for  experimentation,  which  in  turn  implies  the  need  for  a laboratory  that  is 
associated  with  the  artifact  we  are  studying.  This  laboratory  can  only  exist  in  an 
environment  where  software  is  being  built,  i.e.,  as  part  of  a real  software  development 
and  maintenance  organization.  Thus  we  propose  that  Software  Engineering 
Laboratory  (SEL)  type  activities  exist  in  all  organizations  to  support  software 
engineering  research. 

In  this  paper  we  will  try  to  describe  the  SEL  from  a researcher's  point  of  view.  Jerry 
Page  and  Frank  McGarry  will  discuss  the  corporate  and  government  benefits  of  the 
SEL  I will  try  to  focus  my  discussion  on  the  benefits  to  the  research  community. 

The  SEL  as  a Research  Laboratory 

The  SEL  is  a laboratory  that  allows  us  to  understand  the  various  software  processes, 
products  and  other  experiences,  build  descriptive  models  of  them,  understand  the 
problems  associated  with  building  software,  develop  solutions  focused  on  the 
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problems,  experiment  with  the  proposed  solutions  and  analyze  and  evaluate  their 
effects,  refine  and  tailor  these  solutions  for  continual  improvement  and  effectiveness 
and  enhance  our  understanding  of  their  effects,  and  build  relevant  models  of  software 
engineering  experiences. 

The  SEL  has  been  in  business  for  over  15  years  and,  based  upon  our  experiences,  its 
activities  have  evolved  over  time.  In  this  section,  I will  describe  the  activities  as  they 
progressed  over  three  phases. 

The  first  phasel  'will  call  the  understanding  phase  because  we  worked  on 
understanding  what  we  could  about  the  environment  and  measurement.  During  this 
period  we  measured  what  we  could,  used  available  models  to  explain  the 
environment  and  our  behavior,  and  built  descriptive  baselines  and  models  typifying 
our  environment. 

In  retrospect  we  made  several  mistakes.  We  collected  too  much  data,  i.e.,  because  we 
did  not  know  what  was  important  we  tended  to  collect  all  kinds  of  data  hoping  they 
would  give  us  insights  into  the  environment.  We  often  blindly  applied  models  and 
metrics  without  understanding  the  subtle  assumptions  and  whether  they  were  relevant 
in  our  environment.  In  a sense,  we  tried  to  evaluate  things  before  we  had  built  a deep 
understanding  of  what  we  were  evaluating.  We  finally  began  to  understand  that 
measurement  needed  to  be  based  upon  models  and  goals.  We  established  goals  and 
a mechanism  for  generating  measures  based  upon  those  goals,  the  first,  primitive 
version  of  the  Goal/Question/Metric  Paradigm.  This  provided  an  informal  approach  to 
organizing  our  data.  Based  upon  our  goals,  we  began  to  build  environment  specific 
models  by  accumulating  knowledge  on  individual  projects  and  building  baselines 
across  multiple  projects.  Eventually  we  developed  descriptive  models  that 
characterized  the  environment.  These  models  included  models  of  resources,  defects, 
and  product  characteristics. 

Once  we  had  an  understanding  or  characterization  of  the  environment  and  the 
projects  we  were  developing,  we  were  able  to  begin  the  process  of  evaluation  by 
comparing  new  projects  against  our  baselines.  This  allowed  us  to  proceed  to  phase 
two  where  the  focus  was  on  improving  the  process,  product,  and  environment. 
During  this  phase,  we  continued  to  build  up  our  data  base  of  baselines  and  models, 
but  we  also  evaluated  and  fed  back  information  to  the  project.  Many  of  these  early  data 
models  were  informal.  The  data  was  saved  in  a data  base  but  the  models  existed 
mostly  in  documents.  We  began  to  experiment  with  various  technologies  to 
understand  their  effect,  i.e.  how  they  changed  the  baselines  or  the  models  we  had.  In 
order  to  provide  a learning  process  across  projects  that  would  allow  us  to  take 
advantage  of  what  we  had  learned  and  evolve,  we  developed  the  Quality 
Improvement  Paradigm,  which  is  based  upon  an  evolutionary,  experimental  approach 
to  software  improvement  based  upon  both  project  and  organizational  feedback  loops. 
The  Goal/Question/Metric  Paradigm  continued  to  evolve  to  recognize  different  types  of 
goals  and  questions  and  take  advantage  of  the  multi-project  perspective.  We  began 
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formafizing  process,  product,  knowledge  and  quality  models. 

This  need  for  formalization  within  the  context  of  the  Improvement  Paradigm  led  to  the 
concept  of  packaging  models  of  our  experiences  so  they  were  reusable  on  other 
projects.  During  this  third  phase  we  worked  on  choosing  potentially  reusable 
experiences,  recognizing  what  was  appropriate  and  relevant  for  the  SEL  We  began 
studying  notations  and  mathematical  formalisms  for  defining  experiences. 

There  are  several  examples  of  current  research  projects  in  packaging  experiences. 
For  example,  we  are  working  on  a project  characterization  model  that  allows  us  to 
recognize  project  patterns  so  that  we  can  predict  which  projects  look  like  the  one  we 
are  working  on.  This  allows  us  to  package  data  for  use  as  cost  estimation  models 
based  upon  our  relevant  past  history  [Briand,  Basili,  Thomas].  Having  recognized  that 
most  experiences  need  to  be  modified  for  use,  we  have  been  defining  models  of 
tailorable  experiences.  For  example,  we  are  working  on  a tailorable  test  method 
[Basili,  Martschenko,  Swain].  The  method  allows  one  to  choose  the  appropriate  test 
techniques  based  upon  the  defect  history  of  similar  projects  and  the  success  rate  of 
the  techniques  in  that  environment.  Another  example  is  the  development  of  a model  cr 
reference  architecture  for  different  types  of  software  factories  [Basili,  Caldiera  and 
Cantone].  We  are  defining  process  models  for  reusing  experience.  We  have 
developed  a reuse-  oriented  evolution  model  [Basili  and  Rombach]  and  are  working 
on  integrating  experience  models  [Oivo  and  Basil!].  We  have  developed  the  concept  of 
an  Experience  Factory,  whose  goal  is  to  package  software  experiences  and  provide 
them  to  projects  upon  demand  and  have  integrated  the  concept  with  an  evolved  QIP 
and  GQM. 


Understanding 

Modeling  environment 
Data  Collection  (GQM) 
Resource  Baselines 
Defect  Baselines 


Improving 

Methodology  Evaluation 
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Figure  1.  Evolution  of  Measurement/Studies  in  the  SEL 
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Figure  1 represents  some  of  the  studies  we  performed  and  the  hierarchy  of  the 
process,  one  phase  based  upon  the  other.  That  is,  there  was  an  understanding 
process  (Phase  1 ),  followed  by  an  improving  process  (Phase  2),  followed  by  a 
packaging  process  (phase  3).  You  can't  improve  until  you  understand,  and  you  can't 
package  until  you  can  assess  and  improve.  We  are  still  understanding  and  trying  to 
improve;  these  activities,  along  with  packaging,  will  go  on  forever. 

The  Research  Framework  Concepts 

We  have  evolved  to  a framework  [Basil!  b]  that  is  based  on  three  basic  concepts,  each 
of  which  is  itself  evolving: 

o The  Quality  Improvement  Paradigm  (QIP),  an  evolutionary  improvement  paradigm, 
based  upon  the  scientific  method,  tailored  for  the  software  engineering, 

o The  Goal/Question/Metric  (GQM)  paradigm,  an  approach  for  establishing  project, 
corporate,  and  research  goals  and  a mechanism  for  measuring  against  those  goals, 

o The  Experience  Factory,  an  organization  that  supports  research  and  development 
by  studying  projects,  developing  and  refining  models,  and  supplying  them  to  projects 
for  further  analysis  and  refinement. 

The  Quality  Improvement  Paradigm  consists  of  the  following  steps: 

o Characterize  the  current  project  and  its  environment  with  respect  to  a vanety  of 
models. 

o Set  the  quantifiable  goals  for  successful  project  performance  and  improvement. 

o Choose  the  appropriate  process  model  and  supporting  methods  and  tools  for  this 
project. 

o Execute  the  processes,  construct  the  products,  collect  and  validate  the  prescribed 
data,  and  analyze  it  to  provide  real-time  feedback  for  corrective  action. 

o Analyze  the  data  to  evaluate  the  current  practices,  determine  problems,  record 
findings,  and  make  recommendations  for  future  project  improvements. 

o Package  the  experience  in  the  form  of  updated  and  refined  models  and  other  forms 
of  structured  knowledge  gained  from  this  and  prior  projects  and  save  it  in  an 
experience  base  so  it  represents  our  current  state  of  knowledge  and  is  available  for 
future  projects. 

The  research  emphasis  is  on  taking  each  of  these  issues  associated  with  the  QIP,  (e.g., 
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characterizing,  goal  setting,  choosing  process,  executing,  analyzing,  and  packaging), 
and  formalizing  and  integrating  them.  Each  of  these  steps  has  evolved  over  the  years. 
We  have  been  building  models  of  characterization.  For  example,  what  are  good 
models  that  allow  me  to  recognize  what  kind  of  software  project  I have  and  what 
projects  are  similar?  Based  on  data,  we  are  using  pattern  recognition  techniques  to 
recognize  where  to  find  the  most  appropriate  kinds  of  experiences  related  to  the 
current  project  [Briand,  Basili,  Thomas]. 

Goal  setting  has  become  a process  of  integrating  models.  A goal  typically  takes  the 
form  of  analyzing  some  form  of  object  from  some  perspective.  I need  models  of  both 
the  object  of  study  and  the  various  perspectives  of  interest  on  that  object. 

We  want  to  choose  processes.  A key  issue  here  is  that  process  is  a variable;  that  I 
need  to  select,  manipulate  and  change  processes  based  on  the  characterization  of  the 
project  and  the  environment  and  the  goals  established  for  this  particular  project. 

Execution  needs  automated  support.  An  automated  system,  SME,  has  been 
developed  to  support  the  accessing  of  data  in  a packaged  form.  The  analysis  and 
packaging  issues  are  the  major  focuses  of  this  paper. 

The  Goal/Question/Metric  Paradigm  is  a mechanism  for  defining  and  interpreting 
operational  and  measurable  software  goals.  Goals  may  be  defined  for  any  object,  for  a 
variety  of  reasons,  with  respect  to  various  models  of  quality,  from  various  points  of 
view,  relative  to  a particular  environment.  A particular  GQM  model  combines  models  cf 
an  object  of  study,  e.g.,  a process,  product,  or  any  other  experience  model  and  one  c- 
more  focuses,  e.g.,  models  aimed  at  viewing  the  object  of  study  for  particular 
characteristics,  such  as  models  of  cost,  correctness,  defect  removal,  changes, 
reliability,  user  friendliness,  etc.  This  implies  that  there  are  models  of  these  quality 
perspectives  developed  and  available  for  use  at  anytime. 

These  modals  can  be  analyzed  from  a point  of  view,  e.g.,  the  perspective  of  the  person 
needing  the  information,  which  orients  the  type  of  focus  and  when  the  interpretation  of 
the  information  is  made  available  and  for  any  purpose,  e.g.,  characterization, 
evaluation,  prediction,  motivation,  improvement,  which  specifies  the  type  of  analysis 
necessary. 

The  result  is  a GQM  model  relative  to  a particular  environment.  Environments  are 
distinguished  based  upon  a variety  of  factors,  e.g.,  problem  factors,  people  factors, 
resource  factors,  process  factors,  etc. 

Experimental  Approaches 

Given  a form  of  the  scientific  method,  in  the  guise  of  the  QIP,  a mechanism  to  generate 
research  hypotheses,  in  the  guise  of  the  GQM,  what  kinds  of  experimentation  can  we 
perform?  The  chart  in  Figure  2 offers  four  classes  of  studies  that  we  can  and  have 
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performed.  The  approaches  can  be  characterized  by  the  number  of  teams  replicating 
each  project  and  number  of  different  projects  analyzed. 
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Figure  2.  Classes  of  Studies  and  Scopes  of  Evaluation 

The  single  project  case  study  is  where  most  people  begin.  There  is  a project  and 
someone  has  decided  to  study  it.  The  results  can  provide  some  insight  into  project 
development  in  the  environment. 

A multi-project  variation  type  study  involves  the  measurement  of  several  projects 
where  factors,  such  as  a method,  can  be  varied  across  similar  type  projects.  This 
allows  the  experimenter  to  study  the  effects  of  variations  to  the  extent  that  the 
organization  allows  them  to  vary  on  different  projects.  In  fact,  that’s  literally  what  we  do 
in  the  SEL  We  have  a large  number  of  projects,  we  have  standard  baseines  of  how 
things  should  happen,  and  we  start  to  perturb  them  by  making  changes  and  studying 
the  effects  of  those  changes. 

The  replicated  project  study  involves  several  replications  of  the  same  project  by 
different  subjects.  Each  of  the  issues  studied  is  applied  to  the  project  by  several 
subjects  but  each  subject  applies  only  one  of  the  technologies.  It  permits  the 
experimenter  to  establish  control  groups. 

The  blocked  subject-project  study  allows  the  examination  of  several  factors  within  the 
framework  of  one  study.  Each  of  the  issues  studied  is  applied  to  a set  of  projects  by 
several  subjects  and  each  subject  applies  each  of  the  technologies  under  study.  It 
permits  the  experimenter  to  control  for  differences  in  the  subject  population  as  well  as 
study  the  effect  of  the  particular  projects. 
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There  are  two  problems  with  the  controlled  types  of  experiments:  (1)  they  are  rather 
expensive  and  (2)  if  done  for  large  pieces  of  software,  for  example,  one  year  duration 
projects,  they  are  hard  to  control,  especially  over  several  replications.  Therefore, 
even  though  these  types  of  experiments  generate  stronger  confidence  in  the  results 
than  the  non-controlled  type  experiments,  they  must  be  performed  on  small  projects  so 
the  results  do  not  scale  up.  If,  however,  these  experiments  are  run  on  a small  scale 
achieving  reasonable  statistical  results,  then  there  is  motivation  to  experiment  with  the 
technologies  on  a larger  scale  in  either  a case  study  or  a multi-project  variation. 
Combining  the  results  of  the  controlled  experiment  and  the  large-  scale  case  study  or 
multi-project  variation,  we  can  gain  confidence  in  the  validity  of  the  experimental 
results. 

It  is  clear  in  the  SEL  that  we  are  avid  believers  in  experimentation.  We  do  not  believe 
that  any  technology,  method,  tool,  process  model,  etc.  works  under  all  circumstances. 
Everything  has  limits,  areas  where  it  works  well  or  poorly.  If  we  are  dealing  with 
technologies,  we  know  they  have  limits.  Experimentation  is  important  in 
understanding  those  limits. 
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Figure  3.  Example  Classes  of  Studies 

Figure  3 contains  several  example  studies  we  have  performed  in  the  SEL.  These 
studies  cut  across  various  experimental  classes.  When  we  have  found  somsthing 
effective  as  a case  study,  we  eventually  turn  it  into  a multi-project  variation  because  it 
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is  effective  for  the  environment. 

An  Example  Set  of  Studies 

As  an  example  of  an  effective  process  with  which  we  have  performed  multiple  types  of 
experiments,  consider  the  Cleanroom  approach  to  software  development,  as 
suggested  by  Harlan  Mills.  We  first  ran  a replicated  project  study  at  the  University  of 
Maryland  that  showed  that  the  approach  was  very  effective.  We  then  decided  to  run  a 
case  study  here  in  the  SEL,  which  again  was  successful.  We  have  since  begun  two 
new  projects  using  the  approach  and  will  eventually  have  enough  projects  for  an 
analysis  based  upon  multi-project  variation. 

The  key  elements  of  the  Cleanroom  Process  [Dyer],  include  a mathematically-based 
design  methodology  which  includes:  function  specification  for  programs,  state 
machine  specification  for  modules,  reading  by  stepwise  abstraction,  correctness 
demonstrations  when  needed,  and  top-down  development.  The  implementation  is 
done  without  any  on-line  testing  by  the  developer.  There  is  statistically-based, 
independent  testing,  based  on  anticipated  operational  use.  Testing  is  done  from  a 
quality  assurance  orientation. 

The  replicated  Cleanroom  study  had  as  its  goals  to  evaluate  the  Cleanroom  process 
with  respect  to  its  effects  on  the  process,  product  and  developers  relative  to  differences 
from  a non-Cleanroom  process  {Selby,  Basili,  Baker].  The  experiment  was  run  at  the 
University  of  Maryland  with  15  three-person  teams, 10  using  Cleanroom.  The  project 
was  an  electronic  message  system  (~  1500  LOC).  The  teams  were  permitted  3 to  5 
test  submissions  and  the  data  collected  consisted  of  background  and  altitude  surveys, 
on-line  activities  of  the  developers,  and  test  results. 

The  effect  of  the  Cleanroom  approach  on  the  process  was  that  the  Cleanroom 
developers  (1)  felt  they  more  effectively  applied  off-line  review  techniques,  while 
others  focused  on  functional  testing,  (2)  spent  less  time  on-line  and  used  fewer 
computer  resources,  and  (3)  tended  to  make  all  their  scheduled  deliveries. 

The  effect  of  the  Cleanroom  approach  on  the  product  with  regard  to  static  properties 
was  that  the  products  developed  using  the  Cleanroom  approach  had  less  dense 
complexity,  a higher  percentage  of  assignment  statements,  more  global  data,  and 
more  comments.  With  regard  to  operational  properties,  Cleanroom  products  more 
completely  met  requirements  and  had  a higher  percentage  of  test  cases  succeed. 

Based  on  these  results,  we  decided  that  it  was  worth  running  a case  study  in  the  SEL 
to  see  if  the  approach  scaled  up  and  how  it  worked  with  changing  requirements.  In 
applying  the  approach  in  the  SEL,  you  will  see  an  application  of  the  QIP  with  regard  to 
improving  process.  We  begin  with  the  characterization  step  which  asks  the  question, 
"what  relevant  models  exist  that  are  available  for  reuse?"  There  were  three  models: 
the  standard  SEL  model,  which  defines  how  software  gets  developed  in  the  SEL  in  a 
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FORTRAN  environment;  the  IBM  FSD  Cleanroom  model  that  was  applied  on  a prior 
project,  and  the  experimental  model  we  used  for  the  replicated  project. 

The  SEL  goals  were  to  characterize  and  evaluate  the  Cleanroom  approach  in  general 
and  spedficafly  with  regard  to  changing  requirements.  In  prior  applications, 
Cleanroom  had  been  used  on  projects  where  the  requirements  were  basically  fixed  at 
the  beginning  of  the  study.  One  of  the  questions  we  were  often  asked  after  the 
replicated  project  study  was  “would  this  technology  survive  in  an  environment  with 
changing  requirements?"  Since  we  had  not  experimented  with  changing 
requirements,  we  could  not  answer  the  question  with  much  confidence. 

What  had  been  learned  from  the  IBM/Cleanroom  model  application  was  the  basic 
process  model,  methods  and  techniques  and  that  the  process  very  effective  in  the 
given  environment.  From  the  UoM/Cleanroom  model  appScation,  we  teamed  that  no 
developer  testing  enforces  better  reading,  the  process  is  quite  effective  for  small 
projects,  formal  methods  are  hard  to  apply  and  require  skin,  and  there  may  be 
insufficient  failure  data  to  effectively  measure  reliability. 

Based  upon  the  existing  models,  our  goals,  and  the  lessons  learned  from  pricr 
applications  of  Cleanroom,  we  defined  an  initial  SEL  Cleanroom  process  model.  We 
stole  what  was  most  effective  from  prior  applications;  for  example,  the  training  was 
consistent  with  the  University  of  Maryland  course  and  we  emphasized  reading  by  at 
least  two  reviewers. 

Because  this  was  a real  project,  and  there  was  concern  on  the  part  of  some  cf  the 
developers  about  the  effectiveness  of  reading,  e.g.,  that  you  needed  to  test  certain 
algorithms,  we  allowed  back-out  options,  e.g.,  you  coidd  request  permission  to  unit 
test  certain  types  of  algorithms.  These  back-out  options  were  never  used,  but  they  did 
provide  a comfort  level  for  the  developers.  When  we  didn't  know  how  to  hande  some 
aspect  of  the  approach  in  this  environment  we  applied  the  standard  SEL  procsss 
model  as  long  as  it  didn't  conflict  in  principle  with  what  we  were  -ving  to  do.  V/e 
monitored  and  made  changes  to  the  process  model  in  real-time.  We  wrote  lessons 
learned,  and  we  redefined  the  process  for  the  next  time  out 

Some  of  the  major  positive  results  of  the  application  of  Cleanroom  in  the  SEL  include: 
the  approach  scales  up  to  a 30,000  SLOC  project,  it  can  be  used  with  changing 
requirements,  productivity  increased  by  about  30%,  the  failure  rate  during  test  reduced 
to  close  to  50%,  there  was  a reduction  in  rework  effort  (95%  of  the  fixes,  as  opcosed  to 
58%,  took  < 1 hour  to  fix),  only  26%  of  faults  found  by  both  readers  (implying  two 
readers  are  important),  there  were  effort  distribution  changes,  e.g.,  more  time  n design 
and  50%  of  code  time  spent  reading,  code  appears  in  library  later  than  normal  and 
more  ike  a step  function,  there  was  less  computer  use  by  a factor  of  5. 

Negative  lessons  learned  include  the  fact  that  better  training  was  needed  for  the 
methods  and  techniques.  The  kind  of  training  we  had  at  the  university  wasn't  good 
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enough.  For  example,  we  provided  training  where  the  examples  were  stacks,  etc. 
This  was  not  appropriate  for  the  application.  (One  thing  we  have  done  on  the  second 
two  Cleanroom  projects  is  reuse  parts  of  the  first  project  as  examples  in  the  training.) 
We  needed  better  mechanisms  for  transferring  code  to  testers  and  the  testers  need  to 
add  requirements  for  output  analysis  of  code.  As  expected,  we  did  not  have  enough 
error  data  (with  a 30,000  line  project)  to  seed  the  reliability  model  so  there  was  no 
payoff  in  reliability  modeling  in  the  SEL. 

A side  effect  of  this  project  was  that  it  generated  much  more  interest  in  improving  the 
requirements.  This  requirements  problem  existed  independent  of  Cleanroom,  but  the 
approach  exposed  the  problem.  So  there  has  been  a genuine  push  in  having  better 
defined  requirements. 

These  results  were  for  a 30,000  line  project  and  a particular  application.  Is  that  the 
size  limit  for  the  Cleanroom  process?  Suppose  we  try  a 100,000  line  project ...  what 
are  the  limits  of  this  particular  technology?  When  does  it  start  to  fall  apart?  Even  if  it 
doesn’t  work  for  a given  size  project,  that’s  okay ...  we  now  understand  the  bounds  on 
that  technology.  It  should  not  be  expected  that  a technology  works  under  all 
circumstances,  every  time,  and  every  place.  We  have  to  understand  as  a community 
that  technology  has  limits  and  that  we  have  to  select,  and  modify  processes 
appropriate  for  the  situation. 

The  next  two  experiments  will  emphasize  the  application  of  the  formal  models  more, 
we  are  using  the  box  structure  approach,  a change  in  the  application  domain  for  one 
project,  and  a scale  up  to  a 1 00  KLOC  for  the  other  project. 

This  has  been  an  example  of  the  Quality  Improvement  Paradigm  in  terms  of  a 
particular  process,  and  in  terms  of  experimental  design  moving  from  controlled 
experiments  to  case  studies  in  a real  environment,  and  moving  from  case  study  to 
multi-project  environment. 

And  we  continue  to  evolve. 

Packaging  the  Experience 

We  have  just  discussed  a form  of  packaging,  the  documentation  of  the  Cleanroom 
process  model.  We  currently  have  a working  document  that  represents  the  model  as 
we  understand  it  today.  But  it  will  change  as  we  learn! 

Packaging  experience  requires  the  continual  accumulation  of  evaluated  experiences 
(learning)  in  a form  that  can  be  effectively  understood  and  modified  (experience 
models)  into  a repository  of  integrated  experience  models  (experience  base)  that  can 
be  accessed  and  modified  to  meet  the  needs  of  the  current  project  (reuse). 

Systematic  learning  requires  support  for  recording  experience  off-line  generalizing 
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and  tailoring  of  experience  formalizing  experience.  Off-line  is  a key  word  here. 
Packaging  cannot  be  done  as  part  of  a project  development.  Someone  cannot 
perform  this  analysis  and  build  models  at  the  same  time  they  are  building  software. 
There  needs  to  be  a separate  organization,  either  physically  or  logically  separate. 

Packaging  useful  experience  requires  a variety  of  models  and  an  experience  base. 
The  models  require  formal  notations  that  are  tailorabie,  extendible,  understandable, 
flexible  and  accessible.  An  effective  experience  base  must  contain  accessible  and 
integrated  set  of  analyzed,  synthesized,  and  packaged  experience  models  that 
captures  the  tocaJ  experiences. 

The  Experience . Factory  is  a logical  and/or  physical  organization  (sepa  ate  from 
the  project  organization)  that  supports  project  developments  by  analyzing  and 
synthesizing  all  kinds  of  experience  models  acting  as  a repository  for  such  experience 
supplying  that  experience  to  various  projects  on  demand.  It  packages  experience  by 
buikfing  informal,  formal  or  schematized,  and  productized  models  and  measures  of 
various  software  processes,  products,  and  other  forms  of  knowledge  via  people, 
documents,  and  automated  support 

There  are  a variety  of  software  engineering  experiences  that  we  can  package: 
resource  baselines  and  models,  change  and  defect  baselines  and  models,  product 
baselines  and  models,  process  definitions  and  models,  method  and  technique  models 
and  evaluations,  products,  lessons  learned,  quality  models,  etc.  In  the  SEL,  they  exist 
in  the  form  of  standards,  policies,  tools.  The  documents  range  from  sets  of  lessons 
learned  to  a manager’s  handbook. 

There  are  many  forms  of  packaged  experience.  We  can  use  mathematical  equations 

defining  the  relationship  between  variables,  e.g..  Effort  = a* Size*3.  We  can  present 
raw  or  analyzed  data  in  the  form  of  histograms  orpie  charts,  e.g.,  % of  each  class  of 
fault  We  can  plot  graphs  defining  ranges  of  “normar,  e.g.,  graphs  of  size  growth  over 
time  with  confidence  levels.  We  can  write  specific  lessons  learned  associated  with 
project  types,  phases,  or  activities,  e.g.,  reading  by  stepwise  abstraction  is  most 
effective  for  finding  interface  faults,  or  in  the  form  of  risks  or  recommendations,  e.g., 
definition  of  a unit  for  unit  test  in  Ada  needs  to  be  carefully  defined.  We  can  create 
models  or  algorithms  specifying  the  processes,  methods,  or  techniques,  e.g.,  an 
SADT  diagram  defining  Design  Inspections  with  the  reading  technique  a variable 
dependent  upon  the  focus  and  reader  perspective. 

For  example,  in  the  SEL  we  have  a whole  set  of  equations  that  define  the  relationships 
between  a variety  of  variables  [Basili.  Panlilio-Yap].  Management  can  use  these 
equations  to  understand,  predict,  and  evaluate.  In  the  SEL,  example  packaged 
relationships  include: 

Effort  = 4.37  + 1.43devlines 
Effort  = 5.5  + 1.5newfines 
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Docpages  » 99.1  + 30.9  devlines 
Numruns  = -108  + 151  devlines 
For  projects  under  50  KLOC  we  have: 

Effort  = .877  + 1.5newlines 
while  for  projects  over  50  KLOC  we  have: 

Effort  = 66.9  + .003  numruns 

We  have  been  able  to  demonstrate  that  methodology  favorable  impacts  software  cost 
and  quality  but  cumulative  complexity  unfavorable  impacts  these  factors  [Basili  a). 

We  have  fault  profiles  that  allow  us  to  compare  and  analyze  environments  and 
projects.  For  example,  what  percent  of  faults  of  a particular  type,  based  on  a particular 
classification  scheme,  occur  during  a standard  FORTRAN  development.  Are  the 
percentages  the  same  for  an  Ada  development?  We  have  been  able  to  show  that  Ada 
reduces  the  percent  of  interface  faults,  but  not  by  the  amount  one  might  expect  based 
upon  the  ability  of  Ada  compilers  to  check  for  interface  faults  [Brophy]. 

Conclusions 

Based  upon  our  experiences,  we  need  a set  of  experience  factories  or  SELs,  each 
focused  on  packaging  local  experiences  by  building  and  tailoring  local  models, 
integrating  technologies,  studying  scale-up,  building  experience  bases,  and 
developing  automated  aids. 

It  is  still  hard  to  answer  questions  like:  how  big  should  an  SEL  be?  should  the 
experience  factory  only  be  domain  specific,  should  it  focus  on  a homogeneous 
environment? 

If  the  SELs  are  focussed  on  homogeneous  environments,  we  will  need  to  integrate 
these  local  experience  factories  into  a high  level  experience  factory  that  abstracts  from 
local  experiences,  looks  for  patterns  across  environments,  and  generates  the  basic 
models  of  the  science.  But  how  is  this  accomplished? 

What  we  can  do  now  is  take  advantage  of  the  experimental  nature  of  software 
engineering.  Processes,  products,  and  environments  can  be  measured  and  can  be 
used  to  support  practical  development  and  research.  The  integration  of  the 
Improvement  Paradigm,  the  Goal/Question/Metric  Paradigm,  and  the  Experience 
Factory  Organization  can  provide  a framework  for  both  development  and  research. 

Based  upon  our  experience,  it  helps  us  derive  descriptive  models  of  our  experiences, 
understand  our  experiences  and  our  problems,  evaluate  and  learn  from  our 
experiences,  and  build  effective  prescriptive  models  of  our  experiences  and  our 
quality  objectives.  It  can  and  should  be  applied  today  and  evolve  with  technology. 

Taking  advantage  of  the  experimental  nature  of  software  engineering  has  provided  a 
winning  situation  for  research  and  development.  From  a researcher’s  perspective  the 
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SEL  has  been  a smashing  success.  Its  evolution  has  been  slow,  we  have  made  many 
mistakes,  but  we  have  learned  a lot.You  don’t  have  to  make  the  same  mistakes  we 
did,  you  can  learn  from  our  experiences. 
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Software  Engineering  Research 


There  is  a great  deal  of  software  engineering 
research  going  on,  i.e.,  people  are  building 
technologies,  methods,  models,  etc. 


What  is  the  problem? 

The  research  is  mostly  bottom-up,  done  in  isolation 

It  cannot  be  easily  logically  or  physically  integrated 

It  is  not  aimed  at  solving  the  big  problem 

It  is  not  evaluated  or  analyzed  via  experimentation 

It  is  not  refined  and  tailored  to  the  application 
environment 

It  cannot  be  easily  transferred  into  practice 

We  cannot  understand  the  relationships  between 
various  models  of  the  process  and  product 


Software  Engineering  Research 


What  is  needed? 


A top  down  experimental,  evolutionary  framework 
in  which  research  can  be  focused,  logically  and 
physically  integrated  to  produce  quality  software 
productively,  and  evaluated  and  tailored  to  the 
application  environment 

An  experimental  laboratory  that  is  associated  with 
the  artifact  we  are  studying 

We  need  SEL  type  activities  to  support  software 
engineering  research 
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What  is  the  SEL 

from  a researchers  point  of  view? 

A laboratory  that  allows  us  to 

understand  the  various  processes,  products  and 
other  experiences  and  build  descriptive  models 

understand  the  problems  associated  with  building 
software 

develop  solutions  focused  on  the  problems, 
experiment  with  them  and  analyze  and  evaluate 
their  effects 

refine  and  tailor  these  solutions  for  continual 
improvement  and  effectiveness  and  enhance  our 
understanding  of  their  efforts 

build  models  of  software  engineering  experiences 


V.Easilf 


How  have  the  activities  evolved? 


Evolving  concepts  for  over  15  years 
Phase  1 

Worked  on  understanding  what  we  could  about  the 
environment  and  measurement 
measured  what  we  could 
collected  too  much  data 
used  available  models 

blindly  applied  models  and  metrics 
tried  to  evaluate  before  understanding 
built  descriptive  baselines  and  models 
studied  individual  projects 
tried  to  characterize  the  environment 
developed  the  Goal/Question/Metric  Paradigm 
informal  approach  to  organizing  data 
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How  have  the  activities  evolved? 


Phase  2 

Worked  on  improving  the  process  and  product 
evaluated  and  fed  back  information  to  project 
mostly  informal  data  models 
data  automated  but  not  the  models 
experimented  with  technologies 

began  to  understand  effects  locally 
developed  the  Quality  Improvement  Paradigm 
informal  applied  for  cross  project  learning 
evolved  the  Goal/Question/Metric  Paradigm 
recognized  types  of  goals  and  questions 
begarr  formalizing  process,  product,  knowledge 
and  quality  models 


How  have  the  activities  evolved? 


Phase  3 

Working  on  packaging  experiences  for  reuse 
choosing  potentially  reusable  experiences 
recognizing  what  is  appropriate  for  SEL 
studying  notations  for  defining  experiences 
a project  characterization  model 
defining  models  of  tailorable  experiences 
a tailorable  test  method 
product  reuse  models/architectures 
defining  process  models  for  reusing  experience 
defining  a reuse  oriented  evolution  model 
working  on  integrating  experience  models 
developed  the  Experience  Factory  concept  and 
integrated  it  with  an  evolved  QIP  and  GQM 
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Evolution  of  Measurement/Studies  in  the  SEL 


Understands 

Modeling  environment 
Data  Collection  (GQM) 
Resource  Baselines 
Defect  Baselines 


Improving 

Methodology  Evaluation 
Cost  Model  Analysis 
Test  Technique  Analysis 
QIP 


Design  Measures 

Cost  vs.  Size  Complexity 


Packaaim 


SEL  Ada  Process 

SEL  Cleanroom  Process 

SME 

Managers  Handbook 
Experience  Factory 


Ada 

OOD 

Cleanroom 

CASE 

Test  Method 
Reuse 
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Overview  of  the  Current 
Framework 


Quality  Improvement  Paradigm 

an  evolutionary  improvement  paradigm,  based  upon 
the  scientific  method,  tailored  for  the  software 
engineering 

Goal/Question/Metric  Paradigm 

an  approach  for  establishing  project,  corporate,  and 
research  goals  and  a mechanism  for  measuring 
against  those  goals 

Experience  Factory 

an  organization  that  supports  research  and 
development  by  studying  projects,  developing  and 
refining  models,  and  supplying  them  to  projects  for 
further  analysis  and  refinement 
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Quality  Improvement  Paradigm 


* 


Characterize  the  current  project  and  its 
environment  with  respect  to  a variety  of  models. 

Set  the  quantifiable  goals  for  successful  project 
performance  and  improvement. 

Choose  the  appropriate  process  model  and 
supporting  methods  and  tools  for  this  project. 

Execute  the  processes,  construct  the  products, 
collect  and  validate  the  prescribed  data  ,and  analyze 
it  to  provide  real-time  feedback  for  corrective 
action. 

Analyze  the  data  to  evaluate  the  current 
practices,  determine  problems,  record  findings,  and 
make  recommendations  for  future  project 
improvements. 

Package  the  experience  in  the  form  of  updated  and 
refined  models  and  other  forms  of  structured 
knowledge  gained  from  this  and  prior  projects  and 
save  it  in  an  experience  base  so  it  represents  our 
current  state  of  knowledge  and  is  available  for 
future  projects. 
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The  Goal  Question  Metric  Paradigm 


A mechanism  for  defining  and  interpreting 
operational  and  measurable  software  goals 

It  combines  models  of 

an  object  of  study,  e.g.,  a process,  product,  or 
any  other  experience  model  and 

one  or  more  focuses,  e.g.,  models  aimed  at  viewing 
the  object  of  study  for  particular  characteristics 

that  can  be  analyzed  from  a point  of  view,  e.g.,  the 
perspective  of  the  person  needing  the  information, 
which  orients  the  type  of  focus  and  when  the 
interpretation/information  is  made  available 

for  any  purpose,  e.g.,  characterization,  evaluation, 
prediction,  motivation,  improvement,  which 
specifies  the  type  of  analysis  necessary 

to  generate  a GQM  model 

relative  to  a particular  environment 
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Classes  of  Studies 
Scopes  of  Evaluation 


#Projects 


One 

| More  than 
j one 

# of 

One 

Single  Project 

| Multi-Project 

Teams  'j: 

(Case  Study) 

| Variation 

» 

per 

More  than 

' Replicated 

| Blocked 

Project 

one 

Project 

| Subject-Project 
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Classes  of  Studies 
Examples 


Single  Project 

| Multi-Project 

(Case  Study) 

j Variation 

Independent  V&V 

| Effect  of  Methodology 
1 

Cleanroom  Process 

| Resource  Model  Studies 

Defect  Analysis  Studies 

Ada/Object  Oriented  Design 

Code  Reuse  in  Ada/Fortran 

Replicated 

| Blocked 

Project 

j Subject-Project 

Effect  of  Methodology 

| Reading  vs.  Testing 

Cleanroom  Process 

1 

i 

| 

Ada/O-O  Design 

1 

1 

V.Sauli 

Lor*,  of  Maryland 
Pate  27  of  42 


Cleanroom  Process 


Key  components: 

Mathematically-based  design  methodology 
Function  specification  for  programs 
State  machine  specification  for  modules 
Reading  by  stepwise  abstraction 
Correctness  demonstrations  when  needed 
Top-down  development 

Implementation  without  any  on-line  testing  by 
developer 
Independent  testing 

Statistically  based  on  anticipated 
operational  use 
Quality  assurance  orientation 
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Replicated  Cleanroom  Study 

Study  Goal: 

Analyze  the  Cleanroom  process  to  evaluate  it 
with  respect  to  the  effects  on  the  process, 
product  and  developers  relative  to 
differences  from  a non-Cleanroom  process 

Environment: 

University  of  Maryland 

Electronic  message  system  (~  1500  LOC) 

15  three-person  teams  (10  used  Cleanroom) 

Empirical  study: 

3 to  5 test  submissions 
Data  collected 
Background 
Attitude  survey 
On-line  activity 
Testing  results 
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Replicated  Cleanroom  Study 


EFFECT  ON  PROCESS 

Cleanroom  developers  felt  they  more  effectively 
applied  off-line  review  techniques,  while  others 
focused  on  functional  testing 

Cleanroom  developers  spent  less  time  on-line  and 
used  fewer  computer  resources 

Cleanroom  developers  tended  to  make  all  their 
scheduled  deliveries 

EFFECT  ON  PRODUCT 

Static  properties: 

Less  dense  complexity 
Higher  percentage  of  assignment  statements 
More  global  data 
More  comments 
Operational  properties: 

Product  more  completely  met  requirements 
Higher  percentage  of  test  cases  succeeded 
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1 DEFINING  AN  SEL  CLEANROOM 

f PROCESS  MODEL 

« 

t 

J Existing  models:  standard  SEL  model, 

j‘  IBM/FSD  Gleanroom  Model 

experimental  UoM  Cleanroom  model 

4 

1 

1 

Goals:  characterize  and  evaluate  in  general, 
i and  with  respect  to  changing  requirements 

• IBM/Cleanroom  model  lessons  learned: 

I basic  process  model,  methods  and  techniques 

process  very  effective  in  given  environment 


UoM/Cieanroom  model  lessons  learned: 
no  te?+ing  er  orces  better  reading 
process  quite  effective  for  small  project 
forma!  methods  hard  to  apply,  require  skill 
may  have  insufficient  data  to  measure 
reliability 
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DEFINING  AN  SEL/CLEANROOM 
PROCESS  MODEL  (Cont.) 

Define  SEL/Cleanroom  process  model: 

Use  informal  state  machine  and  functions 
Training  consistent  with  UoM  course  on  process 
model,  methods,  and  techniques 
Emphasize  reading  by  two  reviewers 
Allow  back-out  options  for  unit  testing  certain 
modules 

When  no  new  information,  use  standard  SEL 
activities 

Monitor  and  make  changes  to  the  process  model  in 
real  time 

Write  lessons  learned  for  incorporation  into  next 
version 

Redefine  process  for  the  next  execution  of  the 
process  model 
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SOME  LESSONS  LEARNED  USING 
CLEANROOM  in  the  SEL 


Can  scale  up  to  30KLOC 

Can  use  with  changing  requirements 

Failure  rate  during  test  reduced  to  close  to  50% 

Reduction  in  rework  effort 

95%  as  opposed  to  58%  took  < 1 hour  to  fix 

Only  26%  of  faults  found  by  both  readers 

Productivity  increased  by  about  30% 

Effort  distribution  changes: 
more  time  in  design 
50%  of  code  time  spent  reading 

Code  appears  in  library 
later  than  normal 
more  like  a step  function 

Less  computer  use  by  a factor  of  5 
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SOME  LESSONS  LEARNED  USING 
CLEANROOM  in  the  SEL  (Cont.) 

Better  training  needed  for  methods  and  techniques 

Better  mechanisms  needed  for  transferring  code  to 
testers 

Testers  need  to  add  requirements  for  output 
analysis  of  code 

No  payoff  in  reliability  modeling 

Side  effects: 

Caused  more  emphasis  on  requirements  analysis 

Define  next  experiments: 

Apply  formal  models  more  effectively  - use  box 
structure  approach 

Change  application  domain  and  keep  size  the  same 
Scale  up  to  a 100KLOC  project 
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Packaging  the  Experience 


Packaging  requires  the 

continual  accumulation  of  evaluated  experiences 

(learning) 

in  a form  that  can  be  effectively  understood 
and  modified  (experience  models) 
into  a repository  of  integrated  experience 
models  (experience  base) 
that  can  be  accessed  and  modified  to  meet  the 
needs  of  the  current  project  (reuse) 

Systematic  learning  requires  support  for 
recording  experience 

off-line  generalizing  and  tailoring  of  experience 
formalizing  experience 

Packaging  useful  experience  requires 

a variety  of  models  and  formal  notations  that 
are  tailorable,  extendible,  understandable, 
flexible  and  accessible 

An  effective  experience  base  must  contain 
accessible  and  integrated  set  of  analyzed, 
synthesized,  and  packaged  experience  models 
that  captures  the  local  experiences 
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The  Experience  Factory 


Logical  and/or  physical  organization  (separate  from 
the  project  organization)  that  supports  project 
developments  by 

analyzing  and  synthesizing  all  kinds  of 
experience  models 

acting  as  a repository  for  such  experience 

supplying  that  experience  to  various  projects 
on  demand 


It  packages  experience  by  building 

informal,  formal  or  schematized,  and 
productized  models  and  measures 

of  various  software  processes,  products,  and 
other  forms  of  knowledge 

via  people,  documents,  and  automated  support 
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What  kinds  of  experience  can  we 

package? 


Resource  Baselines  and  Models 

Change  and  Defect  Baselines  and  Models 

Product  Baselines  and  Models 

Process  Definitions  and  Models 

Method  and  Technique  Models  and  Evaluations 

Products 

Lessons  Learned 

Quality  Models 

In  the  SEL,  they  exist  in  the  form  of  standards , 
policies,  tools 
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Forms  of  Packaged  Experience 


Equations  defining  the  relationship  between 
variables, 

e.g.  Effort  = a*Size^ 

Histograms  or  pie  charts  of  raw  or  analyzed  data 
e.g.  % of  each  class  of  fault 

Graphs  defining  ranges  of  “normal" 

e.g.  graphs  of  size  growth  over  time  with 
confidence  levels 

Specific  lessons  learned 

associated  with  project  types,  phases,  activities 
e.g.  reading  by  stepwise  abstraction  is  most 
effective  for  finding  interface  faults 
in  the  form  of  risks  or  recommendations 
e.g.  definition  of  a unit  for  unit  test  in  Ada 
needs  to  be  carefully  defined 

models  or  algorithms  specifying  the  processes, 
methods,  or  techniques 

e.g.  an  SADT  diagram  defining  Design 

Inspections  with  the  reading  technique  a 
variable  dependent  upon  the  focus  and 
reader  perspective 
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PACKAGING  EXPERIENCE: 

RESOURCE  MODELS 


In  the  SEL, 

Example  packaged  relationships  include: 

Effort  = 4.37  + 1.43devlines 
Effort  = 5.5  + 1.5newlines 
Docpages  = 99.1  + 30.9  devlines 
Numruns  = -108  + 151devlines 

Projects  under  50kloc: 

Effort  = .877  + 1.5newlines 
Projects  over  50kloc 
Effort  = 66.9  + .003  numruns 

Factors  that  affect  cost  and  quality  are: 
+methodology  (favorable  impact) 

-cumulative  complexity  (unfavorable  impact) 
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CLASSES  OF  ERROR* 


FORTRAN 


Ada 


•ERROR  PROFILES  QUITE  SIMILAR;  EVEN  FOR  DIFFERENT  LANGUAGES 
•Ada  SOMEWHAT  FEWER  INTERFACE  ERRORS 


'BASEO  ON  ERROR  FROM  5 Ada  AND  a FORTRAN  PROJECTS 
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Research  Laboratory  Needs 


We  need  a set  of  SELs  or  Experience  Factories 
each  focused  on  packaging  local  experiences  by 
building  and  tailoring  local  models 
integrating  technologies 
studying  scale-up 
building  an  experience  bases 
developing  automated  aids 
How  big  should  an  SEL  be? 

Should  it  only  be  domain  specific? 
and 

the  integration  of  these  local  experience  factories 
into  a high  level  Experience  Factory  that 
abstract  from  local  experiences 
looks  for  patterns  across  environments 
generates  the  basic  models  of  the  science 
How  is  this  accomplished? 
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Conclusions 


* 


We  can  take  advantage  of  the  experimental  nature  of 
software  engineering 

Process,  product,  environment  can  be  measured  and 
can  be  used  to  support  practical  development  and 
research 

Integration  of  the 

Improvement  Paradigm 
Goal/Question/Metric  Paradigm 
Experience  Factory  Organization 
provides  a framework  for  both  development  and 
research 

Based  upon  our  experience,  it  helps  us 
derive  descriptive  models  of  our  experiences 
understand  our  experiences  and  our  problems 
evaluate  and  learn  from  our  experiences 
build  effective  prescriptive  models  of  our 
experiences  and  our  quality  objectives 

Should  be  applied  today  and  evolve  with  technology 

You  don’t  have  to  make  the  same  mistakes  we  did, 
you  can  learn  from  our  experiences 
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IMPACT  OF  A PROCESS  IMPROVEMENT  PROGRAM  IN  A 
PRODUCTION  SOFTWARE  ENVIRONMENT: 

ARE  WE  ANY  BETTER? 
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ABSTRACT 

For  the  past  15  years,  Computer  Sciences  Corporation  (CSC)  has  participated 
in  a process  improvement  program  as  a member  of  the  Software  Engineering 
Laboratory  (SEL),  which  is  sponsored  by  the  National  Aeronautics  and  Space 
Administration  (NASAyGoddard  Space  Flight  Center  (GSFC).  This  paper 
analyzes  the  benefits  CSC  has  derived  from  involvement  in  this  program.  In 
the  environment  studied,  it  shows  that  improvements  were  indeed  achieved,  as 
evidenced  by  a decrease  in  error  rates  and  costs  over  a period  in  which  both  the 
size  and  the  complexity  of  the  developed  systems  increased  substantially.  The 
paper  also  discusses  the  principles  and  mechanics  of  the  process  improvement 
program,  the  lessons  CSC  has  learned,  and  how  CSC  has  capitalized  on  these 
lessons. 


INTRODUCTION 

Computer  Sciences  Corporation  (CSC)  had 
some  compelling  motivations  to  join  with 
the  National  Aeronautics  and  Space  Admin- 
istration (NASAyGoddard  Space  Flight 
Center  (GSFC)  and  the  University  of  Mary- 
land 15  years  ago  to  form  the  Software  Engi- 
neering Laboratory  (SEL).  In  the  context  of 
1976  and  our  partnership  with  GSFC,  we 
wanted  to  study  our  overall  flight  dynamics 


software  development  process  closely 
enough  to  be  able  to  refine  and  improve  it. 
Even  then,  we  knew  we  had  to  be  able  to  ac- 
curately describe  and  measure  that  process 
before  real  improvements  could  be  made. 
Slowly  and  steadily,  we  embarked  on  a con- 
scious process  improvement  program  that 
would  help  us  produce  the  larger  and  more 
complex  flight  dynamics  ground  systems  re- 
quired to  support  the  more  sophisticated 
spacecraft  being  built. 
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We  wanted  to  build  these  complex  systems 
with  more  reliability  and  greater  economy. 
Our  personnel  were  already  committed  tc 
building  quality  systems;  what  we  needed 
now  was  to  build  quality  systems  more  pro- 
ductively. We  also  needed  to  expand  the 
skills  of  our  current  personnel  and  to  attract 
and  retain  new  personnel  who  would  enjoy 
the  twin  challenges  of  doing  flight  dynamics 
work  and  simultaneously  trying  to  improve 
the  methods  used  to  do  that  work. 

As  competition  to  provide  flight  dynamics 
services  increased  both  here  and  abroad, 
CSC  became  more  ambitious  in  efforts  to 
improve  its  processes  and  products  and 
more  committed  to  allocating  the  resources 
needed  to  make  these  improvements.  We 
wanted  to  validate  our  belief  that  higher 
quality  at  lower  costs  was  not  a contradic- 
tion. We  wanted  to  show  that,  in  fact,  those 
traits  go  hand  in  hand  and  that  high-quality 
software  really  does  cost  less. 

Sound  business  practices  showed  a need  to 
move  forward,  not  only  to  improve  on  our 
current  work  but  to  seek  new  opportunities 
asweil.  Oneway  to  enter  these  new  business 
areas  was  to  objectively  demonstrate 
superior  products  and  performance  in  our 
work  with  GSFC.  Another  way  was  to  pur- 
sue and  achieve  formal  recognition  by  other 
members  of  our  industry.  Our  motivations 
for  the  SEL  partnership  were  clear  and 
compelling.  From  our  participation  in  the 
SEL,  we  expected  to  capture  specific  gains, 
to  learn  some  vital  lessons,  and  to  demon- 
strate, over  time,  that  we  were  truly  “getting 
better*’  at  doing  flight  dynamics  work. 

Have  we  achieved  these  goals  after  15  years 
of  participation  in  the  SEL?  The  rest  of  this 
paper  answers  this  question.  It  describes 
the  principles  and  mechanics  of  the  SEL 
process  improvement  program,  including 
examples  of  the  program  in  action:  examines 
what  we  have  learned  from  our  role  in  the 


program  and  how  we  have  capitalized  on 
that  learning;  and  analyzes  trends  over  the 
past  15  years  to  determine  quantitatively 
whether  or  not  we  have  met  our  objectives. 

SEL  BACKGROUND 

The  SEL 

The  SEL  (Reference  1)  is  a research  project 
sponsored  by  NASA/GSFC  and  supported 
by  the  Computer  Science  Department  at  the 
University  of  Maryland  and  by  CSC.  The 
SELs  mission  is  to  understand  and  improve 
the  overall  software  development  process. 
To  do  this,  the  SEL  conducts  experiments 
with  production  software  projects,  measures 
the  effect  of  the  techniques  applied,  and 
then  adopts  the  most  beneficial  methodolo- 
gies for  future  projects. 

The  SEL  Environment 

The  production  software  environment  stud- 
ied by  the  SEL  is  ^n  environment  of  similar 
flight  dynamics  applications  developed  by 
GSFC  for  such  spacecraft  problems  as  atti- 
tude and  orbit  determination  and  control, 
mission  planning,  and  maneuver  control. 
These  applications  are  largely  scientific  and 
mathematical,  with  moderate  reliability  re- 
quirements and  severe  development  time 
constraints  imposed  by  a fixed  spacecraft 
launch  date.  Table  1 summarizes  the  current 
characteristics  of  this  environment. 

The  SEL  Process  Improvement 
Program 

The  SEL  process  improvement  program  is  a 
conscious,  continuous  effort  to  build  higher 
quality  systems  at  lower  costs  by  under- 
standing the  environment,  measuring  and 
evaluating  the  results  of  planned  process 
changes,  and  capturing  and  packaging  expe- 
rience to  optimize  the  process  and  to  antici- 
pate uncontrollable  changes. 
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Characteristics  of  the 
Development  Environment 
Studied  by  the  SEL 
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Characteristics 

Current  State 

Organization  size 

>250  people 

Computing  environment 

HOS  8063  (IBM  3083) 
VAX  8820,  11/780 

Languages 

FORTRAN,  Ada 

Applications 

Primarily  attitude;  some 
orbit  and  mission 
analysis 

Average  system  size 

180  KSLOC 

Average  project  duration 

2 years 

Average  staff  level 

15  to  20  people 

Staff  background 

Computer  Science, 
Mathematics,  Physics 

For  a process  improvement  program  to 
succeed,  it  must 

• Be  a conscious  effort  Improve- 
ments will  not  happen  by  themselves; 
resources  must  be  allocated  to  make  them 
happen. 


• Be  a continuous  effort  Even  very 
mature  processes  need  to  be  refined  in  the 
face  of  changing  environments  and  ad- 
vances in  technology. 

• Be  built  on  a solid  understanding  of 
the  environment  This  includes  characteriz- 
ing the  products  produced  and  processes 
used. 


• Achieve  understanding  by  measure- 

ment and  evaluation.  The  parameters  of  the 
environment  must  be  quantified  to  evaluate 
the  effectiveness  of  changes  made  to  it. 


• Feed  back  lessons  leaned.  The  re- 
sults of  measurement  and  evaluation  must 
be  fed  back  into  the  process  to  optimize  it. 

• Package  lessons  learned . Experi- 
ences must  be  packaged  so  that  managers 
can  apply  them  to  their  day-to-day  chal- 
lenges and  can  anticipate  changes  outside  of 
their  control,  thus  preserving  corporate 
legacy  when  experienced  people  leave. 

EVOLVING  TO  AN 
OPTIMIZING  ENVIRONMENT 

Given  the  principles  of  the  SEL  process  im- 
provement program,  wc  can  now  look  at  that 
program  in  action  over  the  SELs  first 
15  years.  For  convenience,  SEL  activities 
are  grouped  into  three  bread  classes: 
evaluating  changes  to  life-cycle  processes, 
evaluating  changes  to  technology  and  meth- 
odology, and  providing  support  to  the  devel- 
opment organization. 

Changing  Life-Cycle  Processes 

A first  goal  of  the  SEL  was  to  establish  a 
measurement  program  to  capture  and  quan- 
tify the  characteristics  of  the  environment, 
including  all  its  processes  and  products. 
The  SEL  spent  much  of  the  first  5 years  sim- 
ply learning  how  to  collect,  analyze,  and  in- 
terpret data.  This  early  analysis  showed  that 
testing  was  one  of  the  weakest  activities  in 
the  flight  dynamics  development  process, 
and  it  set  the  stage  for  several  early  experi- 
ments in  changing  a life-cycle  process. 

The  goal  in  changing  a life-cycle  process  is  to 
identify  a particular  life-cycle  phase  or  activ- 
ity as  a candidate  for  improvement,  vary  just 
that  one  element  of  the  process,  and  then 
measure  the  impact  on  the  process  and 
product.  If  the  analysis  shows  that  the 
change  favorably  affects  quality  and/or  pro- 
ductivity, it  is  incorporated  into  the  process. 
In  essence,  this  type  of  change  can  be  viewed 
as  “fine-tuning"  an  existing  process. 
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In  1981,  in  a step  to  understand  the  weak- 
nesses perceived  in  testing,  the  SEL  eval- 
uated the  impact  of  independent  verification 
and  validation  (IV&V)  in  the  flight 
dynamics  environment  (Reference  2).  It 
applied  IV&V  techniques  on  four  flight  dy- 
namics projects,  defined  metrics  for  analyz- 
ing the  change,  and  compared  these  metrics 
with  those  of  earlier  projects  that  did  not  use 
IV&V.  The  results  showed  little  or  no  signif- 
icant  improvement  in  quality  and  reliability 
and,  at  the  same  time,  reflected  a substantial 
increase  in  development  cost  The  study 
concluded  that  IV&V  was  not  cost  effective 
for  use  in  the  SEL  flight  dynamics  environ- 
ment 

In  1984,  continuing  its  quest  to  improve  test- 
ing, the  SEL  compared  three  different  soft- 
ware verification  techniques  (Reference  3). 
It  trained  a group  of  professional  program- 
mers in  structural  testing,  functional  testing, 
and  the  peer  review  technique  of  code  read- 
ing, and  then  gave  them  programs  that  had 
been  seeded  with  errors  on  which  to  apply 
these  techniques.  After  the  experimenters 
calculated  such  metrics  as  the  number  of  er- 
rors found  and  the  average  effort  expended 
to  find  each  error,  they  concluded  that  code 
reading  was  the  most  cost-effective  tech- 
nique for  uncovering  errors  in  software 
units.  \s  a result,  code  reading  was  incorpo- 
ratt  as  a formal  activity  into  the  flight  dy- 
namics software  development  process. 

By  participating  in  these  life-cycle  process 
change  experiments,  CSC  has  learned 
several  lessons: 

• To  effectively  evaluate  and  imple- 

ment life-cycle  changes,  resources  must  be 
allocated;  that  is,  an  independent  organiza- 
tion like  the  SEL  must  be  designated,  to 
focus  on  measuring  and  evaluating  impacts. 
The  job  is  too  big  for  managers  to  do  in  their 
“spare  time.”  We  have  carried  this  lesson 
beyond  the  SEL  environment  by  establish- 
ing software  engineering  process  groups  to 


perform  this  type  of  analysis  across  the 
entire  Systems,  Engineering,  and  Analysis 
Support  (SEAS)  contract  (Reference  4)  cur- 
rently being  performed  for  GSFC. 

• Peer  review  techniques  are  a 
cost-effective  method  for  isolating  errors 
early  in  the  development  life  cycle.  We  have 
made  such  techniques  a fundamental  part  of 
our  SEAS  System  Development  Method- 
ology (Reference  5). 

Changing  Technology/Methodology 

After  about  the  first  5 years  of  studying  the 
flight  dynamics  environment  and  its  devel- 
opment process  and  experimenting  with 
life-cycle  process  changes,  the  SEL  looked 
back  on  its  experiences  and  drew  some  basic 
conclusions.  One  was  that  following  a for- 
mal methodology,  provided  that  it  is  not 
“labor  intensive,”  can  produce  a 10-  to 
15-percent  improvement  in  a software  devel- 
opment program  compared  to  not  following 
a formal  methodology  or  following  an  ad  hoc 
approach  (Reference  1).  Although  adding 
and  subtracting  new  techniques  in  the  form 
of  life-cycle  changes  can  fine-tune  the  meth- 
odology, it  does  not  produce  substantial 
overall  improvements  to  the  program.  To 
achieve  substantial  changes  requires  a 
major  overhaul  of  the  formal  methodology 
itself  or  the  insertion  of  new  technology. 

The  SEL  approach  to  methodology  and 
technology  changes  is  different  from  the  rei  * 
atively  simple  experimentation  performed 
for  life-cycle  changes.  Rather  than  perform- 
ing a single  experiment,  evaluating  the  re- 
sults, and  deciding  to  implement  a new 
technique  across  the  entire  program,  the 
SEL  knew  that  introducing  an  entire 
methodology  or  technology  would  require  a 
more  cautious  approach  because  of  risks  as- 
sociated with  the  immaturity  of  the  method- 
ology or  technology  and  the  extensive 
retraining  of  staff  required.  The  SEL  ap- 
proach is  to  experiment  with  the  new  meth- 
odology or  technology  via  a pilot  project  or 


% 


* 


G.  Heller 
G.  Pa^c 
CSC 
4 of  25 


> 


6209-3A 


a 


projects,  evaluate  the  metrics  collected,  hy- 
pothesize about  the  potential  benefits,  and 
then  repeat  the  experiment  several  times  to 
confirm  or  deny  initial  hypotheses  and  to  es- 
tablish trends. 

In  1984,  the  SEL  began  evaluating  a method- 
ology based  on  the  Ada  language  and 
object-oriented  design.  This  was  a radical 
change  from  the  top-down  structured  design* 
techniques  and  the  FORTRAN  mindset 
then  in  place  in  the  flight  dynamics  environ- 
ment To  evaluate  the  new  methodology,  the 
SEL  began  an  experiment  in  which  the  same 
flight  dynamics  simulator  was  built  in  two 
parallel  development  efforts:  one  in 

FORTRAN  and  the  other  in  Ada.  Known  as 
the  GRODY  experiment  its  results  have 
been  documented  in  a number  of  papers  and 
reports  in  the  SEL  series  (Reference  6). 
Since  this  first  study,  five  more  simulators 
have  been  built  in  Ada,  and  a separate  study 
was  performed  to  transport  one  of  the  simu- 
lators from  a VAX  environment  to  an  IBM 
mainframe  environment  (Reference  7). 
Although  the  trends  on  these  Ada  projects 
are  still  being  analyzed,  a significant  in- 
crease in  reuse,  with  substantial  develop- 
ment cost  savings,  seems  to  be  the  greatest 
benefit 

Another  methodology  change  with  which 
the  SEL  has  begun  to  experiment  recently  is 
the  cleanroom  development  methodology 
(Reference  8).  This  methodology  relies  on 
human  discipline  and  peer  review  tech- 
niques to  eliminate  errors  early  in  the  life 
cycle.  It  isolates  the  designers  and  coders 
from  the  testers  and  prohibits  the  coders 
from  even  compiling  their  programs. 
Although  the  SEL  had  done  some  early 
evaluations  of  this  methodology  (the  code- 
reading technique  already  discussed  was 
adopted  from  the  cleanroom  methodology), 
it  did  not  begin  a cleanroom  pilot  project 
until  1988.  The  ACME  project  used  the 
cleanroom  approach  to  develop  one  of  the 
subsystems  for  an  attitude  ground  support 


system  (AGSS).  Initial  ACME  data  showed 
an  improvement  in  error  rates  (Reference  9). 
Currently,  two  other  projects  are  using  the 
cleanroom  methodology  to  confirm  and  ex- 
pand upon  the  initial  trends  observed  on 
ACME.  One  effort  is  tiying  to  reproduce 
the  trends  on  another  project  of  ACME’S 
scale  (approximately  30  KSLOC  in  size),  and 
the  other  is  tiying  to  scale  up  and  use  the 
methodology  on  an  entire  AGSS  (more  than 
150  KSLOC  in  size)  to  see  if  similar  trends 
appear. 

By  participating  in  SEL  methodology 
change  experiments.  CSC  has  learned  other 
lessons: 

• We  have  been  able  to  minimize 
the  risks  of  inserting  new  technology  into  the 
flight  dynamics  environment  by  measuring 
and  evaluating  impacts  in  a controlled  fash- 
ion, allowing  educated  decisions  to  be  made 
based  on  quantitative  cost/benefit  tradeoffs. 

• In  the  case  of  Ada,  we  have  been 
able  to  take  advantage  of  the  lessons  learned 
on  the  pilot  projects  by  communicating 
them  to  other  organizations  within  our  com- 
pany through  various  technology  exchange 
forums. 

Supporting  the  Organization 

The  third  category  of  activities  in  the  SEL 
process  improvement  program  is  aimed  at 
supporting  the  needs  of  the  development  or- 
ganization rather  than  making  controlled 
changes  to  the  process  or  environment  This 
involves  the  concepts  of  effectively  capturing 
and  packaging  experience. 

Early  in  its  history,  the  SEL  defined  and 
documented  the  methodology  being  used  to 
develop  flight  dynamics  projects.  It  pub- 
lished a series  of  documents  that  established 
standards  and  guidelines  for  both  devel- 
opers and  managers  in  such  areas  as  design, 
implementation,  and  testing  techniques; 
life-cycle  reviews  and  documentation; 
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planning,  monitoring,  and  controlling 
projects;  cost  estimation;  and  product  assur- 
ance (References  10-15).  These  documents 
helped  capture  experience  in  the  flight  dy- 
namics environment  and  were  instrumental 
in  quickly  training  new  staff.  As  technology 
changed  and  the  SELs  domain  grew,  it  be- 
came evident  that  these  documents  had  to 
evolve  as  welL  Thus,  the  SEL  is  currently  up- 
dating this  series  with  the  dual  objectives  of 

(1)  augmenting  the  methodology  to  broaden 
its  scope  and  include  new  technology  and 

(2)  generalizing  it  where  possible  to  provide 
greater  flexibility  for  making  future  changes. 

In  a related  activity,  the  SEL  developed 
process  models  for  the  environment.  A 
process  model  defines  the  expected  behavior 
of  a particular  measure,  such  as  staff  re- 
sources expended,  over  the  life  cycle  of  a 
project.  Process  models  capture  the  experi- 
ence learned  on  past  projects  and  package  it 
in  a form  that  can  be  used  on  current 
projects.  Models  give  greater  visibility  into 
managing  development  projects.  They  allow 
managers  to  make  at-completion  predic- 
tions of  such,  measures  as  resource  utiliza- 
tion, error  rates,  and  project  schedules. 
They  can  also  be  used  to  determine  when  a 
project  is  deviating  from  the  typical  behav- 
ior of  past  projects  and  help  to  determine 
the  causes  of  such  deviations. 

The  SEL  developed  a tool  that  its  managers 
use  to  take  advantage  of  the  SEL  process 
models.  This  tool,  the  Software  Manage- 
ment Environment  (SME)  (Reference  16), 
allows  managers  to  use  process  models  that 
are  based  on  a pool  of  projects  similar  to  the 
ones  they  are  currently  managing.  It  helps 
them  analyze  progress  on  their  projects,  pre- 
dict outcomes,  and  plan  alternatives,  all  with 
the  advantage  of  using  the  experience  base 
built  up  by  the  SEL  in  the  flight  dynamics 
environment 

The  SEL  process  improvement  program  has 
also  helped  recognize  and  respond  to  the 
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changing  needs  of  the  staff  members  in  the 
environment  Over  the  15  years  since  the 
SEL  started,  the  primary  background  of  the 
developers  in  the  environment  has  shifted 
from  mathematics  and  physics  to  computer 
science.  In  response  to  this,  the  SEL  initi- 
ated a training  program  to  give  new  devel- 
opers a basic  foundation  in  flight  dynamics 
applications  and  quickly  familiarize  them 
with  the  SEL  methodology. 

By  participating  in  SEL  activities  to  support 
the  organization,  CSC  has  learned  even 
more: 

• We  need  to  have  a documented 
methodology  used  consistently  across  the 
environment.  Drawing  on  the  SELs  experi- 
ence in  documenting  the  methodology  used 
in  the  flight  dynamics  environment  and  on 
our  own,  more  general  corporate  methodol- 
ogy, we  have  documented  a system  develop- 
ment methodology  for  use  across  the  entire 
SEAS  contract  (Reference  5),  and  we  have 
supplemented  this  methodology  with  a set  of 
standards  and  procedures  (Reference  17)  to 
help  staff  members  apply  it. 

• We  know  that  quantitative  man- 
agement works.  Measuring  process  and 
product  allows  us  to  develop  quantitative 
models  that  enable  projects  to  be  better 
planned,  more  accurately  estimated,  and 
more  effectively  controlled.  We  can  also 
detect  deviations  from  our  plans  more 
easily,  and  hence  we  can  correct  problems 
earlier.  Recognizing  the  importance  of 
quantitative  management,  we  have  pack- 
aged our  experiences  in  this  area  in  a data 
collection,  analysis,  and  reporting  handbook 
to  be  used  by  our  managers  on  the  SEAS 
contract  (Reference  18). 

• We  need  to  train  our  organization 
in  the  methodology  and  in  process  improve- 
ment concepts.  We  have  developed  a re- 
quired training  program  (Reference  19)  for 
all  engineers,  developers,  testers,  integra- 
tors, and  managers  to  ensure  consistent 
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understanding  and  application  of  the  SEAS 
System  Development  Methodology  and  of 
quantitative  management  techniques  across 
the  entire  contract 

• We  can  write  better  proposals 
when  estimates  are  backed  up  with  solid 
data.  From  a business  point  of  view,  being 
able  to  point  to  a quantitative  experience 
base  lends  credibility  to  proposals  and 
brings  in  more  work. 

ASSESSMENT  OF  PROGRESS 

We  have  seen  some  mechanics  of  the  SEL 
process  improvement  program,  some  spe- 
cific examples  of  the  types  of  activities  in 
which  the  SEL  engages,  and  how  CSC  has 
benefited  from  participating  in  these  activi- 
ties. Using  the  SELs  own  data,  we  now 
address  the  question  “Are  we  any  better?” 
by  examining  some  growth,  reliability,  and 
productivity  trends  over  the  past  15  years. 

Three  areas  measure  changes  in  the  nature 
of  flight  dynamics  systems:  complexity,  gen- 
eral requirements,  and  system  size.  CSC 
performed  a study  \u  1988  to  examine  trends 
in  these  areas,  as  well  as  in  software  reuse 
(Reference  20).  At  that  time,  it  was  generally 
felt  that  systems  were  becoming  more 
complex,  primarily  as  a reflection  of  the  in- 
creased complexity  of  the  spacecraft  they 
were  developed  to  support.  Table  Z adapted 
from  the  study,  shows  a comparison  of  typi- 
cal spacecraft  configurations  in  the 
mid-1970s  with  those  of  the  late  1980s.  This 
table  shows  that  the  required  attitude  accu- 
racy is  50  times  greater  than  it  was,  data 
rates  are  over  14  times  faster,  and  there  are 
3 times  as  many  telemetry  data  types  and 
10  times  as  many  sensors.  In  the  above- 
mentioned  study,  these  and  other  character- 
istics were  combined  into  a synthetic 
measure  of  spacecraft  complexity.  A plot 
showing  the  overall  trend  in  this  complexity 
measure  (Figure  1)  shows  that  it  has  more 
than  doubled  over  the  past  15  years. 
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Tabl«  Z Comparison  of  Spacecraft 
Characteristics 


Characteristics 

Mid-1970s 

Late  1380s 

Control 

Spin 

stabilized 

3-oris 

stablized 

Sanson 

1 

a to  1 1 

Torquers 

1 

2to3 

Onboard 

Analog, 

Digtal. 

computer 

simple  control 

autonomous 

Telemetry  types 

5 

12  to  15 

Oat*  rate* 

2J2WXs 

32  Kb/s 

Accuracy 

1 degree 

0.02  degree 

1978  1971  1980  1982  1984  1988  19«  1990 


Figure  1.  Itends  In  Spacecraft 
Complexity 


The  same  study  also  derived  a measure  of 
functional  specification  complexity  to 
reflect  the  growth  in  general  requirements. 
This  measure  also  more  than  doubled  over 
the  past  15  years,  yet  requirements  growth 
was  not  directly  proportional  to  spacecraft 
complexity.  For  example,  going  from  3 
spacecraft  with  one  sensor  to  a spacecraft 
with  five  sensors  means  that  software  must 
be  developed  to  process  data  from  all  five 
sensors.  Beyond  that,  however,  it  may  also 
mean  an  additional  requirement  to  create  a 
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utility  that  determines  the  best  time  to  use 
one  sensor  instead  of  another  or  to  create  a 
program  that  predicts  periods  when  the 
motor  that  runs  one  sensor  might  interfere 
with  the  operation  of  another  sensor.  In 
addition,  requirements  have  been  added  to 
build  programs  that  perform  such  functions 
as  predicting  Earth  occultation  of  a given  set 
of  stars,  predicting  Moon  interference  with 
sensor  operation,  or  predicting  antenna  con- 
tact times.  Thus,  both  increased  spacecraft 
complexity  and  general  requirements 
growth  can  be  seen  as  separate  drivers  in  the 
growth  of  system  size. 

In  terms  of  system  size.  Figure  2 shows  that 
total  number  of  delivered  lines  of  code 
(including  blank  lines  and  comments)  has 
not  quite  tripled.  At  the  same  time,  develop- 
ment error  rates  have  been  reduced  by 
65  percent  (Figure  3).  Figure  4 shows  the 
trend  in  the  cost  per  developed  line  of  code. 
It  has  remained  relatively  constant,  although 
the  narrowing  of  the  maximum  and  mini- 
mum range  lines  indicates  that  it  is  becom- 
ing more  predictable. 


Looking  at  all  of  these  trends  together  now 
helps  us  answer  our  original  question. 
Although  a rigorous  study  of  the  relation- 
ship between  spacecraft  complexity,  re- 
quirements growth,  and  system  size  has  not 
been  performed,  one  could  expect  that  a 
doubling  in  both  complexity  and  general  re- 
quirements might  result  in  a quadrupling  of 
system  size.  Since  system  size  did  not  quite 
triple,  we  conclude  that  developers  are  now 
packaging  more  functionality  per  line  of 
code  than  they  were  15  years  ago.  Thus,  the 
SEL  process  improvement  program  has 
enabled  us  to  build  systems  that  provide 
more  functionality  per  line  of  code,  with  sig- 
nificantly fewer  errors  per  line  of  code,  at  a 
lower  cost  per  line  of  code  than  systems  of 
15  years  ago.  It  is  clearly  possible  to 
improve  productivity  and  lower  error  rates 
at  the  same  time. 


Figure  2.  Ttenda  in  Size  Growth  of  Flight 
Dynamics  Applications 


w 1975  1975  1960  1952  1964  1986  1966  1990 

Figure  3.  Trends  In  Development 
Error  Rates 
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Figure  4.  TVends  In  Software  Cost 


In  addition,  process  models  derived  from 
SEL -collected  data  have  helped  us  predict 
error  rates  and  system  costs  more  accu- 
rately. Thus,  the  answer  to  the  question  “Are 
we  any  better?”  has  to  be  an  unqualified 
“Yes." 
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Impact  of  a 

Process  Improvement  Program  in  a 
Production  Software  Environment: 

Are  we  any  better? 
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Software  Engineering  Laboratory 

Environment 


Characteristic 

Current  State 

Organization  Staff  Level 

>250 

Computing  Environment 

HSD  8063  (IBM  3083), 
VAX  8820,  VAX-11,780 

Languages 

FORTRAN,  Ada 

Application 

Attitude,  orbit,  mission  analysis 

Average  Svstem  Size 

180  KSLOC 

Average  Project  Duration 

2 Years 

Average  Staff  Level 

15  to  20 

Staff  Background 

Computer  Science,  Mathematics, 
and  Physical  Sciences 

Computer  Sciences  Corporation 
System  Sciences  Division 
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What  Is  a 

Process  Improvement  Program? 


A conscious,  continuous  effort  to  build  higher  quality 
systems  by 

• Understanding  the  environment 

• Measuring  and  evaluating  the  results  from  planned 
changes 

• Capturing  and  packaging  experience  to  optimize 

the  process  and  to  anticipate  uncontrollables 
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Life  Cycle  Process  Changes 


• Perform  experimental  studies  on  production  projects 

• Vary  one  element  of  process  and  measure  impacts  on  process  and  product 

• Fine  tune  process  to  take  advantage  of  benefits 


Testing  Studies 

Observations  and  Actions 

Code  Reading, 
Functional  and 
Structural  Testing 

• Code  reading  most  effective  technique 

• Add  to  process 

Independent 
Verification  and 
Validation 

• Small  effects  for  relatively  large  cost 

• IV&V  inappropriate  for  SEL  projects 

Computer  Sciences  Corporation 
System  Sciences  Division 
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Technology/Methodology  Changes 


• Test  new  technology  in  production  environment  with  pilot  project 

• Measure  impacts  on  project  profiles  and  products  produced 

• Package  lessons  learned,  adjust  training,  and  repeat  for  effect 


Technology 

Observations  and  Actions 

Ada 

• Very  promising  trends  on  software  reuse 

• Conduct  further  and  more  detailed  studies 

Cleanroom 

• Initially,  error  levels  very  low 

• Scale  up  experiment  and  verify  findings 

Computer  Sciences  Corporation 
System  Sciences  Division 
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Organizational  Changes 


• Look  for  deviations  from  process  models 

• Determine  impacts 

• Strengthen  definitions  of  overall  approach 


Change 

Action  Taken 

Staff  Turnover  or 
Staff  Growth 

• Created  and  augmented  standards  and  guidelines 

• Developed  Software  Management  Environment  (SME) 

Staff 

Background 

• Established  required  training  program  for  developers 
and  managers 

• Developed  Software  Development  Environment  (SDE) 

Domain  Growth 

• Augmented  methodology  to  broaden  scope 

• Generalized  methodology  to  make  it  more  flexible 

CSC 
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Mid  1970's 


Control:  Spin  Stabilized 
Sensors:  1 
Torquers:  1 

OBC:  Analog 

Simple  Control 
Telemetry:  5 
Data  Rates:  2.2  kb/s 
Accuracy:  1 Degree 


Late1980's 


Control:  3-Axis  Stabilized 
Sensors:  8 to  11 
Torquers:  2 to  3 
OBC:  Digital 

Autonomous  Control 
Telemetry:  12  to  15 
Data  Rates:  32  kb/s 
Accuracy:  0.02  Degree 
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1976  1978  1980  1982  1984  1986  1988  1990 
Complexity  has  more  than  doubled. 

‘D.  Boland,  A Study  on  Size  and  Reuse  Trends  In  Attitude  Ground  Support  Systems  (AGSSs)  Developed  for  the 
Flight  Dynamics  Division  (FDD)  (1976-1988),  CSC/TM-89/6031,  CSC,  February  1989 
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Delivered  Lines  of  Code 
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Development  Error  Rates 


1976  1978  1980  1982  1984  1986  1988  1990 

Error  rates  have  been  reduced  by  65  percent. 
Error  models  are  fairly  well  established. 
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Cost  of  Code 
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Cost  per  LOC  remained  relatively  constant. 
Predictability  is  improving. 
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Code  Reuse  in  Systems 
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Sometimes  interesting  things  in  a picture  are  lost 
because  of  shallow  depth  of  field. 
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Code  Reuse  in  Ada  Simulators 
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However,  searching  with  a reduced  field  of  view 

can  pay  off. 
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What  CSC  Has  Learned 


• Quantitative  management  works 

• Peer  review  works 

• You  can  lower  error  rates 

• You  can  raise  productivity 

• You  can  write  more  credible  proposals 
when  you  can  back  them  up  with  data 
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What  Has  CSC  Done  to 
Capitalize  on  Its  Learning? 

• Developed  a System  Development  Methodology  based  on  its 
experience 

• Packaged  its  experience  with  quantitative  management  In  a 
manager's  Data  Collection,  Analysis,  and  Reporting  Handbook 

• Developed  a set  of  standards  and  guidelines  to  complement 
its  methodology 

• Developed  required  training  programs  for  engineers,  developers, 
testers,  integrators,  and  managers  to  maximize  the  benefits  of  its 
methodology 

• Established  measurement-based  Engineering  Process  Groups  to 
identify  improvement  areas,  recommend  changes,  and  evaluate  the 
impact  of  those  changes 
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TOWARDS  UNDERSTANDING  SOFTWARE- 
15  YEARS  IN  THE  SEL 


Frank  McGarry 
Rose  Pajerski 

GODDARD  SPACE  FLIGHT  CENTER 


ABSTRACT 

For  15  years,  the  Software  Engineering  Laboratory  (SEL)  at  NASA/Goddard 
Space  Flight  Center  (GSFC)  has  been  carrying  out  studies  and  experiments  for 
the  purpose  of  understanding,  assessing,  and  improving  software,  and  soft- 
ware processes  within  a production  software  environment  The  SEL  com- 
prises three  major  organizations: 

• NASA/GSFC  Flight  Dynamics  Division 

• University  of  Maryland  Computer  Science  Department 

• Computer  Sciences  Corporation  Flight  Dynamics  Technology 

Group 

These  organizations  have  jointly  carried  out  several  hundred  software  studies, 
producing  hundreds  of  reports,  papers,  and  documents  [Reference  1]— all 
describing  some  aspect  of  the  software  engineering  technology  that  has  under- 
gone analysis  in  the  Flight  Dynamics  environment.  The  studies  range  from 
small  controlled  experiments  (such  as  analyzing  the  effectiveness  of  code  read- 
ing versus  functional  testing)  to  large,  multiple-project  studies  (such  as  assess- 
ing the  impacts  of  Ada  on  a production  environment).  This  paper  will 
summarize  the  key  findings  that  the  sponsoring  organization  (NASA)  feels 
have  laid  the  foundation  for  ongoing  and  future  software  development  and  re- 
search activities. 


L BACKGROUND 

In  1976,  NASA/GSFC  initiated  an  effort  to  carry  out  experiments  within  the  Flight 
Dynamics  area  to  attempt  to  measure  the  relative  merits  of  the  numerous  software 
technologies  that  were  both  available  and  claimed  to  be  significant  ‘improvements’ 
over  currently  used  practices.  Although  significant  advances  were  being  made  in  de- 
veloping new  technologies,  such  as  structured  development  practices,  automated 
tools,  quality  assurance  approaches,  and  management  tools,  there  was  very  limited 
empirical  evidence  or  guidance  pertaining  to  applying  these  promising,  yet  immature 
methods.  Primarily  to  address  this  situation,  the  Software  Engineering  Laboratory 
(SEL)  was  formed. 
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The  SEL  was  formed  as  a partnership  between  NASA,  the  University  of  Maryland,  and 
Computer  Sciences  Corporation.  This  working  relationship  has  been  maintained 
continually  since  1976  with  relatively  little  change  to  the  overall  goals  of  the  organiza- 
tion during  its  entire  history.  In  general,  the  goals  have  matured;  they  have  not  been 
changed.  The  goals  can  be  itemized  as  follows: 

1.  Understand— Improve  the  insight  that  exists  in  characterizing  the  software 
process  and  its  products  in  a production  environment, 

2.  Assess— Measure  the  impact  that  available  techniques  have  on  the  software 
process.  Determine  which  techniques  are  appropriate  for  the  environment 
and  can  improve  the  software, 

3.  Infuse— After  identifying  process  improvements,  package  the  technology  in 
a form  to  be  applied  and  useful  to  the  production  organization. 

The  approach  taken  to  attain  these  three  generalized  goals  has  been  to  apply  poten- 
tially beneficial  techniques  to  the  development  of  production  software  and  to  measure 
the  process  and  product  in  reasonable  detail  to  assess  quantitiably  the  applied  technol- 
ogy. Measures  of  concern,  such  as  cost,  reliability,  and/or  maintainability,  are  defined 
as  the  organization  determines  the  major  near-  and  long-term  objectives  for  its  soft- 
ware development  process  improvement  program.  Once  those  objectives  are  deter- 
mined, the  SEL  staff  designs  the  experiment,  that  is,  defines  the  particular  data  to  be 
captured  and  the  questions  that  must  be  addressed  in  each  experimental  project. 

All  of  the-experiments  conducted  by  the  SEL  have  occurred.within  the  production-en- 
vironment of  the  Flight  Dynamics  software  development  facility  at  NASA/GSFC. 
This  software  can  be  characterized  as  scientific,  nonembedded,  relatively  complex 
software.  Projects  are  typically  developed  in  FORTRAN,  although  about  25  percent 
of  the  projects  utilize  another  language  such  as  Ada,  C,  or  PASCAL.  The  duration  of 
each  effort  normally  runs  from  2 to  3-1/2  years,  with  an  average  staff  size  of  approxi- 
mately 15  software  developers.  The  average  size  of  one  of  these  projects  runs  approxi- 
mately 175,000  source  lines  of  code  (counting  commentary),  with  about  25  percent 
reused  from  previous  development  efforts.  Since  this  environment  is  relatively  consis- 
tent, it  is  conducive  to  the  experimentation  process.  In  the  SEL,  there  exists  a homoge- 
neous class  of  software,  a stable  development  environment,  and  a very  controlled, 
consistent  management  and  development  process. 

The  following  three  major  functional  organizations  support  the  experimentation  and 
study  within  the  SEL  environment: 

1.  Software  developers,  who  are  responsible  for  producing  the  flight  dynamics 
application  software. 

2.  Software  engineering  analysts,  who  are  the  researchers  responsible  for  car- 
rying out  the  experimentation  process  and  producing  study  results 
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3.  Data  base  support  staff,  who  are  responsible  for  collecting,  checking  and 
archiving  all  of  the  data  and  information  collected  from  the  development 
efforts. 

Since  its  inception  in  1976,  the  SEL  has  carried  out  studies  and  experiments  involving 
nearly  100  flight  dynamics  projects.  Detailed  data  have  been  collected  and  studied, 
and  numerous  reports  and  journal  papers  have  been  produced.  From  all  of  this  analy- 
sis and  from  all  of  these  studies,  seven  key  points  have  been  identified  that  reflect  in- 
sight gained  by  the  SEL  and  which  are  the  guiding  principles  for  future  development 
and  research  within  this  organization.  These  seven  key  points,  described  under 
EXPERIENCES  below,  address  nearly  every  aspect  of  the  activities  within  the  SEL 
experimentation  process  and  should  provide  some  guidance  to  other  organizations  in- 
volved with  software  development  and/or  software  engineering  research. 

II.  EXPERIENCES 

Point  1:  Measurement  is  cut  Essential  Element  of  Software  Process  Improvement 

It  is  imperative  that  software  measurement  be  an  integral  component  of  any  software 
process  assessment  or  process  improvement  program.  Although  this  point  may  seem 
obvious  to  most,  there  is  evidence  that  occasionally  organizations  may  initiate  ‘proc- 
ess improvement’  efforts  without  fully  developing  considerations  and  plans  for  apply- 
ing measurement.  In  addition  to  providing  some  mechanism  for  determining  the 
baseline  characteristics  of  the  software  process  before  any  change  is  adopted,  the 
measurement  aspect  is  necessary  to  gauge  the  impact  of  any  change  to  the  software 
process.  Not  only  does  an  organization  need  to  understand  if  and  by  how  much  soft- 
ware ‘quality’  is  improving  through  enhanced  software  processes,  but  even  the  more 
elementary  assessment  as  to  whether  there  is  any  consistency  within  an  organization’s 
process  (is  it  measurable)  as  well  as  ‘is  change  observable?’  are  points  addressed  only 
through  measurement. 

The  SEL  has  focused  on  software  measurement  as  a tool  to  aid  in  determining  the 
effect  that  changes  to  the  software  process  may  have  on  attributes  of  concern  (cost, 
quality,  reliability, ...).  In  addition  to  this,  it  has  become  evident  that  both  the  meas- 
urement process  as  well  as  the  measurements  themselves  are  extremely  valuable  soft- 
ware management  tools.  The  Flight  Dynamics  environment  has  adopted  the  SEL 
measurement  process  as  an  integral  component  of  the  development  standards  and 
applies  key  measures  in  planning  and  tracking  the  progress  of  projects.  There  are 
eight  key  measures  that  the  Right  Dynamics  software  organization  has  adopted  as 
essential  management  aids  [Reference  2J. 

One  additional  point  that  has  become  apparent  for  the  SEL  after  15  years  of  software 
measurement  is  that  the  adoption  of  an  effective  measurement  program  is  not  cost 
prohibitive.  In  fact,  the  measurement  collection  process  can  be  essentially  a zero  cost 
impact  to  the  development  project— provided  that  a well  thought-out  set  of  measures 
is  adopted  rather  than  an  ill-conceived  large  number  of  measures.  The  most 
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significant  cost  attributable  to  a measurement  program  is  that  of  processing  and  effec- 
tively analyzing/utilizing  the  information— and  this  must  be  done.  It  should  be  ob- 
vious that  the  mere  process  of  collecting  measures  will  be  of  absolutely  no  value  (even 
negative  value)  unless  the  information  is  analyzed  and  factored  back  into  the  software 
process  itself.  Although  the  cost  impact  to  the  development  projects  themselves  can 
be  near  0-percent  overhead,  the  cost  of  processing  and  analyzing  information  as  part 
of  an  effective  process  improvement  program  will  add  10  percent  to  IS  percent  of  the 
development  cost. 

faint  2:  Many  Diversions  Exist  to  a Successful  Process  Improvement  Program 

Most  software  organizations  have  either  attempted  or  at  least  seriously  considered 
adopting  a software  measurement  program.  Unfortunately,  there  are  too  few  exam- 
ples of  projects  or  companies  in  general  that  have  sustained  an  effective  measurement 
program.  Many  reasons  exist  to  explain  why  such  a critical  element  of  software  engi- 
neering consistently  fails,  and  the  SEL  has  experienced  most  of  the  significant  impedi- 
ments and  pit-falls  that  can  discourage  tire  use  of  measurement  programs.  Three  of 
the  most  significant  diversions  that  the  SEL  has  experienced  and  which  seem  to  plavue 
numerous  other  software  organizations  include  die  following: 

1.  Excessive  planning/replanning— If  someone  is  serious  about  starting  a 
measurement  program,  it  is  more  important  to  get  started  with  a very  small  effort  as 
opposed  to  developing  the  full  set  of  measures,  tools,  analysis  approaches,  etc.  The 
key  is  to  start  small  and  grow  with  experience— but  at  least  start. 

2.  Over-Dependence  on  Statistical  Analysis— Although  the  use  of  analysis 
tools  is  certainly  required  in  applying  measurement  to  the  software  process,  there  are 
occasions  when  the  analysts  attempt  to  uncover  more  information  from  available  data 
(measures)  than  is  reasonable.  Intuition  is  an  excelle.-:  starting  point  for  the  analysis 
process,  and  it  is  certainly  enhanced  or  challenged  by  rt  dstical  information:  but  there 
is  danger  in  assuming  that  the  mathematical  interpretation  of  some  quite  inexact  fig- 
ures can  lead  to  a more  accurate  conclusion  than  the  figures  dictate.  Too  often,  the 
common  sense  of  experienced  software  developers  and  managers  is  ignored  in  favor  of 
statistics  produced  with  possibly  flawed,  misinterpreted,  or  missing  data. 

3.  Looking  Under  the  Lamp-Post — As  in  the  case  of  the  person  who  has  lost  a 
coin  in  the  dark  part  of  a street,  but  chooses  to  search  for  it  beneath  the  lighted 
lamp-post  because  it  is  easier  to  see,  software  engineers  occasionally  address  those 
software  topics  that  are  easiest  to  study  as  opposed  to  those  that  are  the  real  problems 
for  software  development/management.  There  has  been  significant  effort  put  into 
studying,  rebuilding,  and  modifying  such  tools  as  code  analyzers,  auditors,  conveners, 
graphical  design  aids,  etc.,  when  there  is  doubt  as  to  the  real  driving  need  for  such 
small  modifications  to  very  old  and  well-understood  technology.  Excessive  studies 
continue  to  be  conducted  on  antiquated  complexity  metrics  and  on  15-year-old 
cost-modeling  techniques,  when  there  are  extremely  difficult  areas  to  be  addressed. 
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such  as  design  measures,  software  specification  tools  and  analyzers,  and  integrated 
environments. 

Point  3:  People  Are  the  Most  Important  Resource /Technology 

In  reviewing  the  results  of  the  numerous  studies  and  experiments  that  the  SEL  has  con- 
ducted over  the  past  15  years,  it  is  apparent  that  the  most  effective  technologies,  that 
result  in  the  most  significant  benefit,  are  those  that  leverage  the  skills  of  the  software 
developers  themselves.  Numerous  studies  outside  of  the  SEL  environment  have 
shown  that  the  productivity  of  individuals  can  easily  vary  by  as  much  as  a factor  of  10 
to  1.  In  addition  to  this  fact,  SEL  studies  have  indicated  that  those  methods  and  tools 
that  emphasize  human  discipline  are  far  more  effective  than  those  that  merely  attempt 
to  take  work  away  from  the  developers. 

Such  software  techniques  as  code  reading,  inspections,  walk-throughs,  and  all  aspects 
of  ‘Cleanroom’  are  examples  that  have  been  shown  to  be  extremely  effective  [Refer- 
ence 3].  All  of  these  are  directed  toward  maximizing  the  potential  of  individuals  as 
opposed  to  removing  the  individual  from  the  process. 

Point  4:  Environmental  Characteristics  Should  Dictate  Selected  Software  Engineer- 
ing Techniques 

Experiences  in  the  SEL  have  verified  the  expectation  that  standards,  methods,  and,  in 
general,  all  software  engineering  approaches  must  be  tailored  to  specific  environ- 
ments. Although  the  point  seems  to  be  obvious,  we  as  practitioners  and  software  engi- 
neers often  attempt  to  apply  a new  technique  or  method  expecting  certain 
improvements  without  first  analyzing  whether  the  methodology  is  addressing  the 
needs  of  the  environment.  For  example,  if  a development  organization  historically 
produces  highly  reliable,  well-tested  software,  then  there  is  probably  little  benefit  to 
be  derived  from  modifying  the  testing  approach  by  applying  an  automated  test  genera- 
tor. 

Additionally,  it  must  be  understood  that  all  software  environments  evolve  with  time 
and  undergo  some  level  of  change.  Because  of  this,  the  overall  process  must  be  contin- 
ually observed  to  identify  changing  and  evolving  practices  in  order  to  respond  with  the 
most  appropriate  modifications  to  methods,  tools,  etc. 

Point  5:  Automation  is  an  Instrument  of  Process  Improvement,  Not  a Replacement 
for  Process  Understanding 

As  was  mentioned  previously,  the  foundation  of  the  process  improvement  paradigm  is 
that  of  understanding  the  software  process  and  associated  products—  which  may  then 
lead  to  assessment  and  to  process  improvement.  Automated  tools  may  provide  some 
help  in  understanding  this  process,  but  too  often  we  expect  the  automation  process  to 
resolve  problems  that  we  don’t  clearly  understand  in  a manual  sense.  If  a software 
developer  or  manager  cannot  clearly  represent  and  grasp  some  process  manually,  the 
application  of  a software  tool  will  only  make  the  process  less  understood  and  more  ill 
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defined.  This  overreliance  on  automation  is  occasionally  exemplified  by  organizations 
that  move  too  swiftly  in  the  adoption  of  CASE  (or  related  technology)  before  the  over- 
all development  characteristics  are  analyzed  and  the  need  for  automated  tools  is  iden- 
tified. Another  example  can  be  seen  in  the  attempts  of  managers  to  use  code 
analyzers,  auditors,  and  automated  complexity  analyzers  to  gain  insight  into  ‘complex- 
ity’ without  being  able  to  discern  this  trait  in  any  of  the  products  or  processes. 

Although  it  is  unwise  to  try  to  automate  immature  processes  or  to  apply  tools  where  no 
tool  is  needed,  there  are  excellent  examples  of  tools  and  overall  automation  that  re- 
flect significant  advances  in  applying  this  technology  to  recently  maturing  disciplines. 
Such  an  example  is  the  recent  development  of  the  ‘Software  Management  Environ- 
ment (SME)’  [Reference  4]  which  is  used  by  the  Flight  Dynamics  Division  at  NASA/ 
GSFC  to  automate  the  use  and  interpretation  of  historical  software  data,  models, 
measures,  and  intuition  toward  the  management  of  active  software  projects. 

Paint  6:  Heritage  of  the  Environment  Will  Strongly  Influence  the  Software  Process 

It  seems  rather  obvious  to  say  that  a development  environment  has  its  own  characteris- 
tics of  process  and  process  improvement  and  that  the  heritage  of  this  environment  will 
certainly  influence  the  development  of  project  after  project,  but  the  level  to  which  the 
past  performance  of  a software  organization  dominates  even  the  use  of  significantly 
different  technology  is  quite  surprising.  It  is  the  most  prevalent  influence  that  the  SEL 
has  seen  in  its  environment  where  evolving,  new  technology  is  continually  applied  to 
observe  impacts  to  the  software  process,  and  major  changes  to  methodology  are  con- 
tinually made. 

For  example,  the  technology  impact  from  the  introduction  of  Ada  into  the  SEL  envi- 
ronment has  been  under  study  since  1985  when  the  first  Ada  system  was  developed. 
One  of  the  early  expectations  was  that  there  would  be  a significant  change  to  the  effort 
distribution  over  the  implementation  (design,  code,  test)  period  for  these  Ada  systems 
in  comparison  with  previous  FORTRAN  systems.  To  date,  this  has  not  been  observed 
in  the  SEL— effort  distributions  based  on  these  activities  have  remained  essentially 
the  same  and  continue  to  reflect  past  SEL  experience.  Since  changes  to  an  established 
development  process  occur  slowly,  the  changes  themselves  tend  to  evolve  over  timie  as 
more  experience  is  gained  with  the  new  technology.  As  expected,  the  use  of  various 
Ada  constructs  (generics,  packages,  typing,  tasking)  in  the  more  recent  Ada  projects  is 
considerably  different  than  in  earlier  systems. 

Paint  7:  Software  Can  Be  Measurably  Improved  Through  Appropriate  Use  of  Aval- 
able  Technologies 

Possibly  the  most  important  point  evinced  as  a result  of  the  15  years  of  study  within  the 
SEL  is  that  software  (both  the  process  and  products)  can  be  quantifiably  improved 
through  the  selected  application  of  methods,  tools,  and  models  that  exist  today.  It  has 
often  been  argued  that  since  ‘the  human  being’  is  the  dominant  factor  in  any  software 
project,  the  modification  or  application  of  any  approach  to  the  development  process 
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cannot  be  observed  nor  can  it  have  any  significant  impact  on  improving  measures  of 
importance. 

Experience  has  verified  the  fact  that  researchers  often  attempt  to  apply  and  measure, 
to  extremeiy  detailed  levels,  techniques  that  may  not  be  ‘measurable’;  however,  it  has 
also  shown  that  overall  trends  are  definitely  measurable  when  the  measurement  proc- 
ess becomes  an  integral  part  of  the  applied  methodology.  As  was  described 
previously,  because  a specific  software  technology  may  not  be  applicable  to  all  envi- 
ronments, each  environment  must  clearly  define  its  goals,  strengths,  and  weaknesses 
before  it  attempts  to  observe  positive  impacts  from  some  modified  approach. 

There  are  specific  methodologies  that  the  SEL  has  applied  and  measured  over  a long 
period  of  time  and  that  have  been  verified  as  having  positive  impact  on  the  cost,  reli- 
ability, and  overall  quality  of  software  within  the  Flight  Dynamics  environment  Such 
techniques  include  ’Reading’  (as  applied  to  design,  code,  and  test),  Ada,  object- 
oriented  development,  design  criteria  (e.g.,  strength),  measurement  and  many  others. 
There  are  software  practices  that  will  significantly  and  measurably  improve  the  soft- 
ware within  any  specific  environment 

in.  OVERALL  COST/IMPACT  OF  THE  MEASUREMENT  EFFORTS  IN  THE  SEL 

For  15  years,  NASA  has  been  funding  these  efforts  to  carry  out  experiments  and  stud- 
ies within  the  SEL.  There  has  been  significant  cost  and  general  overhead  to  this  effort, 
and  a logical  question  that  is  asked  is  'Has  it  all  been  worth  it?’  The  answer  is  a re- 
sounding YES.  Not  only  has  the  expenditure  of  resources  been  a wise  investment  for 
the  Flight  Dynamics  area  within  NASA,  but  members  of  the  SEL  strongly  believe  that 
such  efforts  should  be  commonplace  throughout  the  Agency  as  well  as  throughout  the 
software  community.  The  benefits  far  outweigh  the  cost. 

Since  the  SELs  inception  in  1976,  NASA  has  spent  approximately  S 14  million  dollars 
in  the  three  major  support  areas  required  by  this  type  of  study  environment:  research 
(such  as  defining  studies  and  analyzing  results),  technology  transfer  (such  as  producing 
standards  and  policies),  and  data  processing  (such  as  collecting  forms  and  maintaining 
data  bases).  Additionally , approximately  50  staff-years  of  NASA  personnel  effort  has 
been  expended  on  the  SEL.  During  this  same  time  period,  the  Flight  Dynamics  area 
has  spent  approximately  $130  million  on  building  operational  software,  all  of  which 
has  been  part  of  the  study  process  to  some  degree. 

During  the  past  15  years,  the  SEL  has  certainly  had  significant  impact  on  the  software 
being  developed  in  the  local  environment,  and  there  is  strong  reason  to  believe  that 
many  of  the  results  and  studies  of  the  SEL  have  had  favorable  impact  on  a domain 
broader  than  just  the  NASA  Flight  Dynamics  area.  Examples  of  the  changes  that  have 
been  observed  include  the  following: 

1.  The  ‘manageability’  of  software  has  improved  dramatically.  In  the  late 
1970s  and  early  1980s,  this  environment  experienced  wide  variation  from  project  to 
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project  in  productivity,  reliability,  and  quality.  Today,  however,  the  SEL  has  excellent 
models  of  the  process;  has  well-defined  methods;  and  is  able  to  predict,  control,  and 
manage  the  cost  and  quality  of  the  software  being  produced. 

2.  The  cost  per  line  of  new  code  has  decreased  somewhat  (about  10  percent), 
and  at  first  glance  this  may  imply  that  the  SEL  has  failed  at  improving  productivity. 
Although  the  SEL  finds  that  the  cost  to  produce  a new  source  statement  is  nearly  as 
high  as  it  was  14  years  ago,  there  is  appreciable  improvement  in  the  functionality  of  the 
software,  as  well  as  tremendous  increases  in  the  complexity  of  the  problems  being 
addressed.  Also,  there  has  been  an  appreciable  increase  in  the  reuse  of  software 
(code,  design,  methods,  test  data,  etc.),  which  has  driven  the  overall  cost  of  the  equiva- 
lent functionality  down  significantly.  When  we  merely  measure  the  cost  to  produce 
one  new  source  statement,  the  improvement  is  small;  but  when  we  measure  overall 
cost  and  productivity,  the  improvement  is  significant. 

3.  Reliability  of  the  software  has  improved  by  35  percent.  As  measured  by  the 
number  of  errors  per  thousand  lines  of  code  (E/KSLOC),  the  Flight  Dynamics  soft- 
ware has  improved  from  an  average  of  8.4  E/KSLOC  in  the  early  1980s  to  approxi- 
mately 5.3  E/KSLOC  today.  These  figures  cover  the  software  phases  up  through  and 
including  acceptance  testing  (beginning  of  operations).  Although  the  operational  and 
maintenance  data  are  not  nearly  so  extensive  as  the  development  data,  the  small 
amount  of  data  available  indicates  significant  improvement  in  that  area  as  well. 

4.  Other  measures  include  the  effort  put  forth  in  rework  (changing,  fixing,  etc.) 
and  in  overall  software  reuse.  These  measures  also  indicate  a significant  improvement 
to  the  software  within  this  one  environment. 

In  addition  to  the  common  measures  of  software  (cost,  reliability,  etc.),  there  are  many 
other  major  benefits  derived  from  such  a ‘measurement’  program  as  that  in  the  SEL- 
Not  only  has  our  understanding  of  software  significandy  improved  within  the  research 
community,  but  this  understanding  is  apparent  throughout  the  entire  development 
community  within  this  Flight  Dynamics  environment.  Not  only  have  the  researchers 
benefited,  but  it  is  obvious  that  the  developers  and  managers  who  have  been  exposed 
to  this  effort  are  much  better  prepared  to  plan,  control,  assure,  and,  in  general, 
develop  much  higher  quality  systems.  One  view  of  this  entire  program  is  that  it  is  a 
major  ‘training’  exercise  within  a large  production  environment,  and  the  800  to 
1000  developers  and  managers  who  have  participated  in  development  efforts  studied 
by  the  SEL  are  much  better  trained  and  effective  software  engineers. 
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SEL  ENVIRONMENT 


DEVELOPERS  S/W  ANALYSTS 

(DEVELOP  FLIGHT  DYNAMICS  S/W) -D.ieMT  (STUDY  PROCESS) 


STAFF  5-10  RESEARCHERS 

FUNCTION  • SET  GOALS/QUESTIONS/ 
METRICS 

- DESIGN  STUDIES/ 
EXPERIMENTS 

• ANALYSIS/RESEARCH 

• REFINE  S/W  PROCESS 

- PRODUCE  REPORTS/ 
FINDINGS 

1976-1990  - 250  REPORTS/DOCUMENTS 


DATA  BASE  SUPPORT  (MAINTAIN/QA  SEL  DATA) 

STAFF  2-5  (FTE) 

FUNCTION  • PROCESS  FORMS/DATA 

• QA 

• RECORD/ARCHIVE  DATA 

• MAINTAIN  SEL  DATA  BASE 

• OPERATE  SEL  LIBRARY 

1976-1990 


SEL  DATA  BASE 


0 

| FORMS  UBRARY  (^jj 

J REPORTS  UBRARY  |^| 
• OVER  150,000  ‘FORMS’ 


STAFF 

150-250  (FTE) 

TYPICAL  PROJECT 
SIZE 

- 150-200  KSLOC 

ACTIVE  PROJECTS 

-6-10 

(AT  ANY  GIVEN  TIME) 

PROJECT  STAFF 
SIZE 

- 15-25  PEOPLE 

1976-1990 

- 75  PROJECTS 

MEASURES 
FOR  EACH 
PROJECT 


REFINEMENTS 

TO 

DEVELOPMENT 

PROCESS 
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MEASUREMENT  IS  AN  ESSENTIAL  ELEMENT 
OF  S/W  PROCESS  IMPROVEMENT 


• MEASURES  DEFINE  PROCESS/PRODUCT  BASELINE  AND  GAUGE  CHANGE 

- ONLY  MEANS  OF  PROVIDING  UNDERSTANDING 

- WITHOUT  MEASUREMENT  CANNOT  DETERMINE  CHANGE/IMPROVEMENT 

• MEASURES  - SIGNIFICANT  ASSET  TO  S/W  MANAGEMENT/DEVELOPMENT 

- VITAL  FOR  PLANNING/ESTIMATING 

- PROVIDES  INSIGHT  TO  HEALTH  OF  PROJECTS 

• MEASUREMENT  IS  NOT  COST  PROHIBITIVE 

- EXISTS  SMALL/CRITICAL  SET  OF  MEASURES 

- CRITICAL  SET  LESS  THAN  2%  IMPACT  TO  PROJECT 

- BENEFITS  FAR  OUTWEIGH  THE  OVERHEAD 
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MEASURES  - GAUGING  CHANGE  AND  IMPROVEMENT  IN  THE  SEL 


PERCENTAGE  OF  PROJECT  SCHEDULE  COMPLETED 
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• MEASUREMENT  AS  A MANAGEMENT  AID 


TRACKING  “COBE”  RELIABILITY 


CODE/TEST  SYSTEM  TEST  ACCEPTANCE  TEST  OPERATIONS 

If*  I , 

ig}  MEASURING  ERROR  RATES  CAN  PROVIDE  EARLY 

s INDICATION  OF  SOFTWARE  QUALITY 
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PEOPLE  ARE  MOST  IMPORTANT 
RESOURCE/TECHNOLOGY 


® 

TEST  TECHNIQUES  EXPERIMENT  DESCRIPTION 


• 3 APPROACHES  STUDIED 

- CODE  READING 

- FUNCTIONAL  TESTING 

- STRUCTURAL  TESTING 


32  PEOPLE  PARTICIPATED 
(GSFC,  UM,  CSC) 

3 UNIT-SIZED  (100  SLOC) 
PROGRAMS  SEEDED  WITH  ERRORS 
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% OF  FAULTS  DETECTED 


61 

51 

38 

CODE  FU 

NOTIONAL  STRUCTURAL 

READING  TESTING  TESTING 


NUMBER  OF  FAULTS  DETECTED 
PER  HOUR  OF  EFFORT 


EFFECTIVE  TECHNOLOGY  SHOULD  FOCUS  ON 
“PERSONNEL”  POTENTIAL 
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MANY  DIVERSIONS  EXIST  TO  A SUCCESSFUL 
PROCESS  IMPROVEMENT  PROGRAM 


© 

(DIVERSIONS  THE  SEL  HAS  BEEN  THROUGH) 

• EXCESSIVE  PLANNING/REPLANNING 

- JUST  DO  IT/START  SMALL 

- LEARN  WITH  EXPERIENCE 

- RELY  ON  LOCAL  STANDARDS  (E.G.,  TERMINOLOGY) 

• OVER  DEPENDENCE  ON  STATISTICAL  ANALYSIS 

- INTUITION  IS  A VERY  USEFUL  STARTING  POINT 

- MAKE  USE  OF  SUBJECTIVE  DATA 

• LOOKING  UNDER  THE  LAMP  POST 

- CODE  ANALYZERS/CONVERTERS 

- COMPLEXITY  METRICS 

- DESIGN  GRAPHIC  AIDS 


ENVIRONMENTAL  CHARACTERISTICS  SHOULD  DICTATE 
SELECTED  SOFTWARE  ENGINEERING  TECHNIQUES 


• SPECIFIC  MEASURES/TECHNIQUES  MAY  NOT  APPLY 
TO  ALL  “DOMAINS” 


• AS  ENVIRONMENT  EVOLVES,  METHODOLOGIES 
SHOULD  FOLLOW  (AND  LEAD) 
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TAILOR  STANDARDS/POLICIES 
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SPECIFIC  MEASURES  MAY  NOT 
APPLY  TO  ALL  “DOMAINS’ 


SOFTWARE  MEASURES  IN  THE  SEL 
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45  75  105  135  165  195 

McCABE  COMPLEXITY 


100 


300  500  700  900  1100  1300 
LINES  OF  CODE 

CORRELATIONS 


as 


TOTAL 

EXECUTABLE 

McCABE 

HALSTEAD 

LINES 

LINES 

COMM  FXI7Y 

LENGTH 

HALSTEAD  LENGTH 

0.85 

0.91 

O.'Jl 

1.00 

McCABE  COMPLEXITY 

0.81 

0.87 

1.00 

EXECUTABLE  LINES 

0.84 

1.00 

TOTAL  LINES 

1.00 
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SAMPLE  OF  688  MODULES 
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CHARACTERISTICS  OF  EFFECTIVE  POLICIES 


STANDARDS/POLICIES  MUST  BE: 


1.  WRITTEN 

2.  UNDERSTOOD 

3.  “LEGACY-BASED” 

4.  ENFORCED 


• MAY  BE  COMBINED  (GENERIC  AND  TAILORED) 

• CAREFULLY  “PRESENTED” 

• NOT  TO  INCLUDE  EXCESSIVELY  ALIEN 

TECHNOLOGY  

• TRAINING  OFTEN  REQUIRED 

• DERIVED  FROM  NEED/LEGACY 

• CONTINUALLY  EVOLVING 

• ALL  ELEMENTS  ARE  "DEPENDABLE" 

• SUPPORTED  BY  MANAGEMENT 

• LIMITED  “DETAIL" 


5.  MEASURABLE 


• OBSERVABLE  (CAN  TELL  IF  IT’S 
BEING  FOLLOWED) 

• REQUIRES  SELF-EVALUATION 


AUTOMATION  IS  AN  INSTRUMENT  OF  PROCESS  IMPROVEMENT 
(NOT  A REPLACEMENT  FOR  PROCESS  UNDERSTANDING) 
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• TOOLS  CAN  PROVIDE  SIGNIFICANT  BENEFIT  TO 
WELL-DEFINED  EXPERIENCE  BASE  (E.G.,  SME  IN  THE  SEL) 

• “IMMATURE”  PROCESSES  ARE  NOT  AUTOMATABLE 

(IF  YOU  CAN’T  DO  IT  MANUALLY  - DONT  TRY  TO  AUTOMATE  IT) 
(E.G.,  OVER  RELIANCE  ON  CASE/ANALYZERS/AUDITORS/ 
MEASUREMENT  TOOLS) 

• EFFECTIVE  TOOLS  MUST  ADDRESS  DEFINED  PROCESS  NEED 
(MATCH  SOLUTION  TO  PROBLEM) 

(E.G.,  OVERUSE  OF  CODE  TRANSLATOR/CODE  ANALYZERS/ 
TEST  GENERATORS,  ...) 
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AUTOMATING  A WELL-UNDERSTOOD  “EXPERIENCE  BASE”  IN  THE  SEL 
(SOFTWARE  MANAGEMENT  ENVIRONMENT  (SME)) 


r 


HERITAGE  OF  ENVIRONMENT  WILL  STRONGLY 

INFLUENCE  PROCESS 
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FORTRAN  Ada 


BY  ACTIVITY 


Muuum  ‘BASED  ON  6 Ada  AND  B FORTRAN  PROJECTS  OF  SIMILAR  TYPE  IN  THE  SEL 


SIGNIFICANT  PROCESS  CHANGE  REQUIRES 
SIGNIFICANT  EFFORT/TIME 


O * on 

to 

(D 


2^0.6 
O V 

< * 0.4 
a. 


85/86 


85/86 


USE  OF  Ada  FEATURES 


GENERICS 


87/88 

PACKAGES 


87/88 


88/89 


88/89 


£§=> 

jig* 

P i 


STRONG  TYPE 


2.65 

1.88 

85/86 


1st  Ada 
85/86 


87/88 

TASKING 


87/88 


88/89 


2nd  Ada  3rd  Ada 


88/89 
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• USE  OF  Ada  FEATURES  CHANGES  APPRECIABLY  WITH  EXPERIENCE 

• NOT  ALL  FEATURES  APPROPRIATE  FOR  APPLICATION 
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SOFTWARE  CAN  BE  MEASURABLY  IMPROVED  THROUGH 
APPROPRIATE  USE  OF  AVAILABLE  TECHNOLOGIES 


© 

EXAMPLES  IN  ONE  ENVIRONMENT  (SEL) 
TECHNOLOGY  DEMONSTRATED  IMPACT 


- "READING” 

- DESIGN  CRITERIA 
(STRENGTH) 


REPEATEDLY  SHOWN  TO  IMPROVE  SOFTWARE 
RELIABILITY  (NO  ADDITIONAL  COST) 

DEMONSTRATED  TO  PRODUCE  MORE  ERROR 
FREE  SOFTWARE 


- Ada  SIGNIFICANT  COST  BENEFIT  THROUGH  REUSE 

- OOD  REUSE 


- CLEAN  ROOM  SIGNIFICANT  IMPROVEMENT  IN  RELIABILITY 

AND  PRODUCTIVITY  (ALSO  RESOURCE 
CONSUMPTION  DOWN) 


- MANAGEMENT/  MAJOR  IMPROVEMENT  IN  PLANNING,  ADJUSTING 

MEASUREMENT  AND  CONTROL 

- COST  ESTIMATION 

- SCHEDULE  ESTIMATION 
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ASSESSING  “STRENGTH”  AND  “SIZE”  AS  A 
STANDARD  FOR  DESIGN 

0 

EXPERIMENT: 

• 450  FORTRAN  MODULES  (ACROSS  4 SYSTEMS  - OVER  20  DEVELOPERS) 

• DETAILED  COST  AND  ERROR  DATA  ON  ALL  MODULES 

• DETERMINE  RELATIONSHIPS:  “STRENGTH”  TO  RELIABILITY  AND 
“SIZE”  TO  RELIABILITY 

RESULTS: 


FAULT  RATE  FOR  CLASSES  OF  MODULE  STRENGTH 


HIGH  STRENGTH  MEDIUM  STRENGTH  LOW  STRENGTH 
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DESIGN  MEASURES  SUMMARY 


• GOOD  PROGRAMMERS  TEND  TO  WRITE 
HIGH-STRENGTH  MODULES 

• GOOD  PROGRAMMERS  SHOW  NO  PREFERENCE 
FOR  ANY  SPEi  ; :C  MODULE  SIZE 

• OVERALL,  HIGH-STRENGTH  MODULES  HAVE 
A LOWER  FAULT  RATE  AND  COST  LESS 
THAN  LOW-STRENGTH  MODULES 

• OVERALL,  LARGE  MODULES  COST  LESS  (PER 
EXECUTABLE  STATEMENT)  THAN  SMALL  MODULES 

• FAULT  RATE  IS  NOT  DIRECTLY  RELATED  TO  MODULE  SIZE 
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Ada  (AND  OOD)*  IMPACTS  ON  “COST' 
FROM  SEL  EXPERIENCES 


■ TOTAL  REUSE 
□ VERBATIM  REUSE 


COST  PER  LINE  OF  CODE 


FORTRAN  Ada 

(5  PROJECTS)  (85/86) 


Ada 

(87/08) 


Ada 

(89/90)* 


(•PARTIALLY  BASED  ON  ESTIMATES) 


80 

LU 

C /) 

3 60 

cc 
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o 

H 20 
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3 60 

EC 
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5 PROJECTS  USING  FORTRAN 
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(66/87)  (87/68)  (88/88)  (88/88)  (88/80) 

5 PROJECTS  USING  ADA  AND  OOD 

96%|B9jr 
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OROOY  QOESIU  GOADA  UARSTELS  EUVEDSW  EUVETELS 
(66/87)  (67/86)  (66/89)  (68/89)  (68/90)  (66/90) 

1.  DEVELOPMENT  COST  PER  STATEMENT  HAS  BEEN  NO  ‘•CHEAPER-  FOR  ADA 

2.  REUSE  POTENTIAL  OF  Ada  IS  SIGNIFICANT 

•ALL  Ada  PROJECTS  APPLIED  OOD  TECHNIQUES 
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HAS  THE  EFFORT  BEEN  WORTH  IT? 

(1975  - 1990) 


• SEL  EXPENDITURES  (1990  DOLLARS) 

- RESEARCH  SUPPORT  (UNIVERSITY) 

(EXPERIMENTATION,  ANALYSIS,  RESEARCH,  REPORTS,  ...) 

- RESEARCH  AND  TECH  TRANSFER  (CSC) 

(ANALYSIS,  RESEARCH,  REPORTS,  OVERHEAD  TO 
DEVELOPMENT  PROJECTS) 

- DATA  PROCESSING  AND  GENERAL  SUPPORT  (CSC  AND  OTHERS) 
(PROCESS/QA  DATA,  SEL  DATA  BASE,  REPORTS,  ...) 

• PRODUCTION  SOFTWARE  (FLIGHT  DYNAMICS)  DEVELOPED 
(CSC  AND  NASA) 
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$5.5M 

$6.0M 

$130M 
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HAS  THE  EFFORT  BEEN  WORTH  IT? 

(1975  - 1990) 

IMPACT  OF  SEL  RESEARCH* 

1976  - 1980  1986  - 1990 

• PROCESS-MODELED 
AND  EFFECTIVE 

• SOFTWARE  MORE 
PREDICTABLE,  CONSISTENT 

• RATIONALE  FOR  METHODS 
USED  EXISTS 


COST  PER  LINE 
OF  CODE 

« 24  SLOC/DAY 

^ 24  SLOC/DAY 

RELIABILITY 

8.4  E/KSLOC 

5.3  E/KSLOC 

(UNIT  TEST  THRU 
ACCEPTANCE) 

CODE  REUSE 

15-25% 

25-35% 

REWORK 

35-40%  OF  TOTAL  EFFORT 

20-30% 

•PROBLEM  COMPLEXITY  AND  SUPPORT  ENVIRONMENT  HAVE  ALSO  CHANGED  SIGNIFICANTLY 
A498.027 


MANAGEABILITY  • COMPLETE  DEPENDENCE 

ON  PERSONNEL  CAPABILITY 

• WIDE  VARIANCE  IN 
COST/QUALITY 

• NO  GUIDANCE  FOR 
SELECTING  METHODS 
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HAS  THE  EFFORT  BEEN  WORTH  IT? 


FINAL  OBSERVATIONS 

• “OUR”  UNDERSTANDING  OF  SOFTWARE  HAS  IMPROVED 
SIGNIFICANTLY  (WE  DO  SOFTWARE  BETTER) 


• CONTRIBUTIONS  TO  SOFTWARE  RESEARCH  AND  DEVELOPMENT 
(MEASUREMENT,  MANAGEMENT,  EXPERIENCE  BASE,  ...) 

• PROFESSIONAL  DEVELOPMENT  OF  DEVELOPERS,  MANAGERS, 
RESEARCHERS 


• “NEW"  AWARENESS  BY  MANAGERS,  DEVELOPERS 
(SOFTWARE  CAN  BE  ENGINEERED) 


A49U.026 


4 


A 


ONGOING/FUTURE  ACTIVITIES  FOR  THE  SEL 


GENERAL 

• CONTINUE  EVALUATION  OF 
“PROCESS  IMPROVEMENT” 

- G/Q/M 

- EXPERIMENTATION 

- REFINEMENT 

• DOMAIN  ANALYSIS  FOR 
“EXPERIENCE  BASE” 

- RELEVANCE  TO  OTHER 
“ENVIRONMENTS” 

- CHARACTERIZING  DOMAIN 
OF  "EXPERIENCE  BASE” 

• EXPANSION  OF  LIFE 
CYCLE  ANALYZED 

- MAINTENANCE 

- SPECS/REQUIREMENTS 

- EXPANDED  MEASUREMENT 


CURRENT/NEAR  FUTURE  STUDIES 

• CLEAN  ROOM  (3  ACTIVE  PROJECTS) 

• Ada  (3  PROJECTS) 

• OOD  (1  ONGOING  EXPERIMENT  - 2 PLANNED) 

• CASE  (1  ACTIVE  PROJECT) 

• REUSE  (USING  EXISTING  SEL  PROJECTS) 

• MAINTENANCE  (3  PROJECTS  FOR  ANALYSIS) 

• TESTING  STRATEGIES  (EXISTING  SEL  PROJECTS) 

• MEASUREMENT  (CHARACTERIZING  DESIGNS) 
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ACTIVITIES 
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STUDIES  IN  THE  SEL 


1976  - 1990 


ASSESSING 


PACKAGING 

• TRAINING  PROGRAM 
• SME 

“MANAGER'S  HANDBOOK” 


UNDERSTANDING 


• CLEAN  ROOM 
• EVALUATE  ADA 

• ASSESS  STRENGTH  AS  DESIGN  CRITERIA 
COMPARE  TEST  TECHNIQUES  (FUNCTIONAL.  READING.  STRUCTURAL) 


(/) 

y 
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• RELATIONSHIP  BETWEEN  DEVELOPMENT  MEASURES 
• ERROR/CHANGE  CHARACTERISTICS 
• RESOURCE  AND  EFFORT  PROFILES 
APPROACH  TO  DATA  COLLECTION 


1976  - 1980 


1980  - 1986 


1986  - 1990 


• DEFINE  PROCESS  • INITIAL  “RELATIONSHIPS'  • PROCESS  IMPROVEMENT  ENVIRONMENT 

• CALIBRATE  “PROCESS  ENVIRONMENT"  • EXPERIMENTS  • FULL  TECHNOLOGY  ASSESSMENT 

• DEFINE  MEASURES/MEASUREMENT  • REFINE  MEASURES/MEASUREMENT  • FULL  USE  OF  MEASUREMENT 
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EVOLVING  TO  AN  EFFECTIVE 
“PROCESS  IMPROVEMENT’  ENVIRONMENT 


SESSION  2- PROCESS  IMPROVEMENT 


R.  E.  Nance,  VPI 

M.  B.  Arend,  McDonnell  Douglas 
R.  Lydon,  Raytheon 
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Software  Development  Methodologies* 
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Index  Terms:  Software  Development  Methodologies,  Procedural  Evaluation,  Evaluating  Method- 

ologies, Software  Engineering  Objectives,  Software  Engineering  Principles,  Induced 
Attributes,  Indicators. 


1.  Introduction 


Over  the  past  two  decades  the  software  development  process  has  changed  dramatically.  Early 
software  development  practices  were  guided  by  useat-of-the-pants”  programming  styles.  Recog- 
nizing maintenance  difficulties  associated  with  such  styles,  the  software  development  community 
began  to  investigate  and  identify  software  engineering  principles  that  could  significantly  enhance 
the  maintainability  and  quality  of  a resulting  product.  Consequently,  development  techniques  that 
exploited  software  engineering  principles  like  abstraction  [LISB75],  information  hiding  [PARD72] 
and  stepwise  refinement  [WIRN71]  were  formulated  and  integrated  into  many  software  development 
processes. 

The  subsequent  demand  for  increasingly  complex  software  systems,  however,  mandated  the 
coordinated  use  of  complementary  principles,  guided  by  an  encompassing  software  development 
philosophy  that  recognized  project  level  goals  and  objectives,  i.e.  a methodological  approach  to 
software  development.  Today,  a myriad  of  tools,  techniques,  and  development  methodologies  ad- 
dress the  challenging  task  of  producing  high  quality  software.  For  example.  SCR  [HENK78]  and 
* Work  supported  by  the  U.S.  Navy  through  the  Systems  Research  Center  under  Basic  Ordering 
Agreement  N60921-83-G-A165  B003-4. 
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DARTS  [GOMH84]  are  development  methodologies  that  emphasize  specific  software  engineering 
goals  (reducing  software  development  costs  and  facilitating  the  design  of  real-time  systems,  respec- 
tively). SREM  [ALFM85]  and  SADT  [ROSD77]  are  methodology  based  environments . Both  focus 
on  particular  phases  of  the  software  life  cycle  and  are  supported  by  unified  sets  of  complementary 
tools. 

The  steady  proliferation  of  design  methodologies,  however,  has  not  been  without  a price.  In 
particular,  users  find  increasing  difficulty  in  choosing  an  appropriate  methodological  approach  and 
recognizing  reasonable  expectations  of  a design  or  development  methodology.  Addressing  this 
concern,  the  research  described  in  this  paper  outlines  a well-defined  procedure  for 

• evaluating  the  adequacy  of  a software  development  methodology  relative  to  project  goals, 
and 

• assessing  the  effectiveness  of  a methodology  relative  to  the  quality  of  the  product  produced. 

The  evaluation  procedure  is  based  on  a substantiated  set  of  linkages  among  accepted  software 
engineering  objectives,  principles,  and  attributes.  These  linkages  reflect  an  assessment  perspective 
structured  by  the  needs,  process,  and  product  sequence  for  system  development,  and  enable  a 
comparative  scale  for  determining  the  adequacy  and  effectiveness  of  the  supporting  development 
methodology.  The  identification  of  code  and  documentation  properties  and  the  definition  of  metrics 
for  these  properties  enables  an  accumulative  determination  of  software  engineering  attributes, 
principles  and  objectives. 

To  provide  a uniform  basis  for  discussion,  Section  2 outlines  the  role  of  a methodology  in 
the  software  development  process.  Section  3 discusses  the  relationship  of  software  engineering 
objectives,  principles  and  attributes  to  the  software  development  effort.  Section  4 identifies  the 
commonly  accepted  objectives,  principles  and  attributes,  defines  the  relationships  among  them, 
and  then  discusses  how  one  evaluates  a methodology  based  on  the  those  relationships.  Finally. 
Section  5 describes  an  application  of  the  assessment  procedure  to  two  Navy  software  development 
methodologies. 
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2.  What  Constitutes  a Methodology 

Fundamental  to  the  research  presented  in  this  paper  is  a common  understanding  of  what 
constitutes  a “methodology”.  Simply  stated,  a methodology  is  a collection  of  complementary 
methods,  and  a set  of  rules  for  applying  them  [FREP77],  More  specifically,  a methodology 

(1)  organizes  and  structures  the  tasks  comprising  the  effort  to  achieve  a global  objective, 
establishing  the  relationships  among  tasks, 

(2)  defines  methods  for  accomplishing  individual  tasks  (within  the  framework  of  the  global 
objective),  and 

(3)  prescribes  an  order  in  which  certain  classes  of  decisions  are  made,  and  ways  of  making 
those  decisions  that  lead  to  the  overall  desired  objective. 

In  general,  software  development  methodologies  should  be  guided  by  accepted  software  engineering 
principles  that,  when  applied  to  the  defined  process,  achieve  a desired  goal.  Based  on  this  common 
understanding  of  what  constitutes  a methodology,  the  following  sections  present  a procedural 
approach  to  evaluating  the  adequacy  and  effectiveness  of  software  development  methodologies. 

3.  The  Role  of  Objectives,  Principles,  and  Attributes  in  Software  Development 

The  development  of  large,  complex  software  systems  is  considered  a project  activity,  involving 
several  analysts  and  programmers  and  at  least  one  manager.  What  then  is  the  role  of  a methodology 
in  this  setting  and  how  does  it  relate  to  objectives,  principles  and  attributes?  Figure  1 assists  in 
providing  an  answer  to  this  question. 

In  general  terms,  an  objective  is  “something  aimed  at  or  striven  for.”  More  specific  to  the 
software  development  context,  an  objective  pertains  to  a project  desirable  - a characteristic  that 
can  be  definitively  judged  only  at  the  comple.  o\  of  the  project. 
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Figure  1 

Illustration  of  the  Relationships  Among  Objectives,  Principles,  Attributes 
in  the  Software  Development  Process 


A software  engineering  principle  describes  an  aspect  of  how  the  process  of  software  development 
should  be  done.  The  process  of  software  development,  if  it  is  to  achieve  the  stipulated  objectives, 
must  be  governed  by  these  “rules  of  right  conduct.” 

Attributes  are  the  intangible  characteristics  of  the  product:  the  software  produced  by  project 
personnel  following  the  principles  set  forth  by  the  methodology.  Unlike  objectives,  which  pertain 
only  to  the  total  project  activity,  attributes  may  be  observed  in  one  unit  of  the  product  ami 
absent  in  another.  The  claim  of  presence  or  absence  of  an  attribute  is  based  on  the  recognition  of 
properties , which  contribute  evidence  supporting  the  claim.  Properties  are  observable,  and  can  be 
subjective  as  well  as  objective  in  nature. 
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Influenced  by  Fritz  Bauer’s  original  definition  of  software  engineering  [BAUF72]  and  reflecting 
the  above  description  of  software  engineering  objectives,  principles  and  attributes,  the  rationale  for 
the  evaluation  procedure  described  in  this  paper  is  founded  on  the  philosophical  argument  that: 

The  raison  d’etre  of  any  software  development  methodology  is  the  achievement  of  one 
or  more  objectives  through  a process  governed  by  defined  principles . In  turn,  adherence 
to  a process  governed  by  those  principles  should  result  in  a product  (programs  and 
documentation)  that  possesses  attributes  considered  desirable  and  beneficial. 

This  philosophy,  exemplified  by  Figure  1,  is  tempered  by  practical  concerns: 

(1)  While  a set  of  software  engineering  objectives  can  be  identified,  this  set  might  not  be 
complete,  and  additions  or  modification  should  be  permitted. 

(2)  Objectives  can  be  given  different  emphasis  within  a methodology  or  in  applications  of  a 
methodology. 

(3)  Attributes  of  a large  software  product  might  be  evident  in  one  component  yet  missing  in 
another. 

4.  A Framework  for  Evaluating  Software  Development  Methodologies 

A broad  review  of  software  engineering  literature  [BERG81,  CHML90,  GAFJ78,  JACM75, 
PARD79,  PARD72,  SCOL78,  WARJ76]  leads  to  the  identification  of  seven  objectives  commonly 
recognized  in  the  numerous  methodologies: 

(1)  Maintainability  - the  ease  with  which  corrections  can  be  made  to  respond  to  recognized 
inadequacies. 

(2)  Correctness  - strict  adherence  to  specified  requirements. 

(3)  Reusability  - the  use  of  developed  software  in  other  applications, 

(4)  Testability  - the  ability  to  evaluate  conformance  with  requirements, 

(5)  Reliability  - the  error-free  performance  of  software  over  time, 

(6)  Portability  - the  ease  in  transferring  software  from  one  host  system  to  another,  and 
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(7)  Adaptability  - the  ease  with  which  software  can  accommodate  to  change. 

The  authors  note  that  these  definitions,  as  well  as  others  presented  in  this  section,  are  abridged: 
they  are  primarily  intended  to  reflect  a working  understanding  based  on  general  literature  usage. 

Achievement  of  these  objectives  comes  through  the  application  of  principles  supported  (en- 
couraged, enforced)  by  a methodology.  The  principles  enumerated  below  are  extracted  from  the 
references  cited  above  (and  others)  as  mandatory  in  the  creative  process  producing  high  quality 
programs  and  documentation. 

(1)  Abstraction  - defining  each  program  segment  at  a given  level  of  refinement. 

(а)  Hierarchical  Decomposition  - components  defined  in  a top-down  manner. 

(б)  Functional  Decomposition  - components  partitioned  along  functional  boundaries. 

(2)  Information  Hiding  - insulating  the  internal  details  of  component  behavior. 

(3)  Stepwise  Refinement  - utilizing  a convergent  design. 

(4)  Structured  Programming  - using  a restricted  set  of  control  constructs. 

(5)  Concurrent  Documentation  - management  of  supporting  documents  (system  specifica- 
tions, user  manual,  etc.)  throughout  the  life  cycle. 

(6)  Life  Cycle  Verification  - verification  of  requirements  throughout  the  design,  development, 
and  maintenance  phases  of  the  life  cycle. 

Employment  of  well- recognized  principles  should  result  in  software  products  possessing  at- 
tributes considered  to  be  desirable  and  beneficial.  A short  definition  of  each  attribute  is  given 
below. 

(1)  Cohesion  - the  binding  of  statements  within  a software  component. 
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(2)  Coupling  - the  interdependence  among  software  components. 

(3)  Complexity  - an  abstract  measure  of  work  associated  with  a software  component  relative 
to  human  understanding  and/or  machine  execution. 

(4)  Weil-defined  Interfaces  - the  definitional  clarity  and  completeness  of  a shared  boundary 
between  a pair  of  components  (hardware  or  software). 

(5)  Readability  - the  difficulty  in  understanding  a software  component  (related  to  complexity). 

(6)  Ease  of  Change  - the  ease  with  which  software  accommodates  enhancements  or  extensions. 

(7)  Traceability  - the  ease  in  retracing  the  complete  history  of  a software  component  from  its 
current  status  to  its  design  inception. 

(8)  Visibility  of*  Behavior  - the  provision  of  a review  process  for  error  checking. 

(9)  Early  Error  Detection  - indication  of  faults  in  requirements  specification  and  design  prior 
to  implementation. 

The  software  development  process,  illustrated  in  Figure  1,  depicts  a natural  relationship  that 
links  objectives  to  principles  and  principles  to  attributes.  That  is,  one  achieves  the  objectives  of 
a software  development  methodology  by  applying  fundamental  principles  which,  in  turn,  induce 
particular  code  and  documentation  attributes.  From  a more  detailed  perspective.  Figure  2 defines 
the  specific  set  of  linkages  relating  objectives  to  principles  and  principles  to  attributes.  As  described 
below,  these  linkages  provide  a framework  for  assessing  both  the  adequacy  of  a methodology  as 
well  as  its  effectiveness . 

4.1  Assessing  the  Adequacy  of  a Methodology 

The  enunciation  of  objectives  should  be  the  first  step  in  the  definition  of  a software  development 
methodology.  Closely  following  is  the  statement  of  principles  that,  employed  properly,  lead  to  the 
attainment  of  those  objectives.  In  tur*.  the  application  of  those  principles  within  a structuied 
software  development  process  will  yield  a product  that  exhibits  desirable  attributes.  The  important 
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Figure  2 

Linkages  Among  the  Objectives,  Principles  and  Attributes 


correspondence  between  the  objectives  and  principles  and  between  the  principles  and  attributes  is 
shown  in  Figure  2;  a literature  confirmation  of  these  relationships  is  discussed  in  [ARTJ87J. 

The  adequacy  of  asoftware  development  methodology  can  be  defined  as  its  ability  to  achieve  the 
software  engineering  objectives  corresponding  to  those  dictated  by  system  needs  and  requirements. 
Intuitively,  the  adequacy  of  a methodology  is  assessed  through  a top-down  evaluation  scheme 
starting  with  an  examination  of  stated  methodological  objectives  relative  to  system  needs  and 
requirements.  This  step  is  then  followed  by  a comparison  of  stated  methodological  principles 
and  attributes  with  those  deemed  most  appropriate.  An  examination  of  linkages  defined  by  the 
evaluation  procedure  reveals  the  *most  appropriate”  set.  Relative  to  the  framework  depicted  by 
Figure  1 and  the  sets  of  linkages  defined  in  Figure  2,  an  application  of  the  evaluation  procedure  to 
the  assessment  of  methodological  adequacy  is  outlined  below. 

8 

R.  Nance 

VPI 

Page  S of  46 


< 


4 


Objectives  cf  the  Methodology:  The  identification  of  objectives  and  the  relationships  tying  objec- 
tives to  needs  and  requirements  is  usually  accomplished  by  reading  the  descriptions  of  a software 
development  methodology.  Evaluation  at  this  level  is  quite  subjective;  however,  the  absence  of  a 
clear  statement  of  objectives  for  a methodology  should  trigger  an  alarm:  Is  the  “methodology" 
only  a tool  or  an  incomplete  set  of  tools  without  coherent  structure?  A methodology  should  not 
be  faulted,  however,  for  emphasizing  certain  objectives  at  the  expense  of  others;  such  prioritization 
can  be  highly  dependent  on  the  application  domain. 

Principles  Defining  the  Process:  Based  on  the  objectives  emphasized  by  the  methodology  and  the 
predefined  set  of  linkages  among  objectives  and  principles,  the  next  step  in  assessing  the  adequacy 
of  a methodology  is  an  investigation  of  the  software  development  process.  That  is,  given  a stated  set 
of  methodological  objectives,  one  asks:  Are  the  principles  supported  by  the  methodology  consistent 
with  those  deemed  necessary  (through  linkage  examination)  to  achieve  the  stated  set  of  objectives? 
The  presence  of  principles  without  corresponding  objective(s)  or  vice  versa  should  evoke  concerns. 
.Although  this  level  of  evaluation  is  inherently  subjective,  some  analytical  quality  is  introduced 
through  the  established  objective/principle  correspondence. 

Attributes  of  the  Product:  The  third  step  in  the  assessment  process,  formulating  the  set  of  expected 
product  attributes,  is  based  on  the  fact  that  principles  govern  the  process  by  which  a software 
product  is  produced.  That  is,  a given  set  of  principles  should  induce  a consequent  set  of  product 
attributes.  Obviously,  the  expected  set  of  product  attributes  should  correspond  to  those  desired  by 
the  software  engineer,  and  to  some  extent,  be  implied  or  stated  in  the  description  of  the  software 
development  methodology.  More  objectivity  is  introduced  at  this  level  because,  although  intangible, 
evidence  of  the  attributes  should  be  discernible  in  the  product. 

4.2  Assessing  the  Effectiveness  of  a Methodology 

While  a top-down  evaluation  process  reveals  deficiencies  of  a software  development  method- 
ology, the  effectiveness  of  a methodology  is  assessed  through  a bottom- up  evaluation  process.  As 
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the  term  implies,  the  effectiveness  of  a methodology  is  defined  as  the  degree  to  which  a method- 
ology produces  a desired  result.  In  particular,  the  effectiveness  of  a methodology  is  reflected  by 
a product’s  conformance  to  the  software  development  process  denned  by  that  methodology.  We 
note,  however,  that  elements  independent  of  the  methodology  can  influence  its  effectiveness,  e.g. 
an  inadequate  understanding  and/or  use  of  the  methodology. 

The  Existence  of  Product  Attributes:  Assessing  methodological  effectiveness  starts  with  an  exami- 
nation of  the  software  product  (code  and  documentation)  for  the  presence  or  absence  of  attributes. 
Because  attributes  are  intangible,  subjective  qualities,  the  current  evaluation  is  based  0:1  defined 
properties  that  provide  evidence  as  to  the  presence  or  absence  of  attributes.  More  specifically, 
the  computation  of  metric  values  reflect  the  extent  to  which  particular  properties  are  observed. 
In  turn,  this  information  is  used  to  synthesize  the  extant  set  of  product  attributes.  Clearly,  the 
set  of  attributes  determined  from  a product  evaluation  should  agree  with  those  induced  by  the 
corresponding  development  methodology.  Set  mismatch  can  imply  an  inappropriate  software  de- 
velopment methodology,  an  inadequate  understanding  of  the  methodology,  or  perhaps,  the  failure 
of  users  to  adhere  to  the  principles  advocated  by  the  methodology. 

Implied  Principles  and  Objectives:  Knowing  which  attributes  are  present  in  the  product,  and 
the  extent  to  which  they  are  assessed  present,  provides  a basis  for  implying  the  use  of  software 
engineering  principles  in  the  software  development  process.  The  rationale  for  such  a statement  is 
based  on  the  observation  that  a principle-to-attribute  linkage  conversely  indicates  an  attribute-to- 
principle  relationship.  Implying  principle  usage  must  be  tempered,  however,  because  of  the  many- 
to-one  relationships  existing  between  attributes  and  principles.  Similarly,  using  the  established 
linkages  among  objectives  and  principles,  one  can  speculate  on  the  achievement  of  stated  software 
engineering  objectives 

In  summary,  the  three  levels  of  examination  defined  by  top-down  e%raluacion  process  establishes 
a procedure  for  determining  how  well  a methodology  can  support  perceived  needs,  requirements, 
and  the  software  development  process.  Conversely,  the  bottom-up  evaluation  process  reveals  how 
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Figure  3 

Illustration  of  the  Evaluation  Process 

well  the  methodology  is  applied  in  the  software  development  process  through  the  use  of  f yuantiiain't 
measures  to  support  an  objective,  qualitative  assessment. 

4.3.  An  Illustration  of  the  Evaluation  Process 

To  illustrate  how  the  evaluation  scheme  can  be  applied,  we  refer  the  reader  to  Figure  3 while 
considering  the  single  objective  of  maintainability.  Formally,  maintainability  can  be  defined  as  :he 
ease  with  which  maintenance  can  be  performed  to  a functional  unit  in  accordance  with  prescribed 
requirements.  Accepting  maintainability  as  an  objective  mandates  the  inclusion  of  six  principles 
(hierarchical  decomposition,  functional  decomposition,  information  hiding,  stepwise  refinement. 
v structured  programming  and  concurrent  documentation)  contributing  to  the  realization  of  that 
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objective.  That  is,  if  a methodology  emphasizes  maintainability  as  an  objective,  then  it  should 
also  stress  the  use  of  the  six  principles  that  are  related  to  maintainability. 

Expanding  on  one  of  those  principles,  information  hiding,  we  note  the  five  attributes  (reduced 
coupling,  enhanced  cohesion,  well-defined  interfaces,  ease  of  change,  and  low  complexity)  that 
should  be  evident  In  software  developed  using  a process  governed  by  the  principle  of  information 
hiding.  This  set  of  expected  attributes  is  then  compared  to  the  desired  set  for  correspondence. 

In  general,  a methodology  should  emphasize  the  same  set  of  software  engineering  objectives 
derived  from  project  level  requirements.  The  methodology  should  correspondingly  stress  the  set  of 
principles  linked  to  those  objectives.  Additionally,  the  expected  set  of  product  attributes  (defined 
by  the  linxages  among  principles  and  attributes)  should  agree  with  the  set  deemed  most  desirable 
by  the  project  manager.  If  the  above  conditions  are  met,  then  the  candidate  methodology  is 
assumed  to  be  adequate  relative  to  project  level,  software  engineering  objectives. 

On  assessing  the  effectiveness  of  an  methodology,  let  us  first  observe  the  relationship  between  a 
particular  attribute  and  specific  product  characteristics.  Referring  again  to  Figure  3,  and  narrowing 
our  attention  to  one  of  the  attributes,  well-defined  interfaces,  we  identify  one  set  of  characteristics 
related  to  the  well-defined  interfaces  attribute.  These  characteristics  form  the  set  of  observable 
properties  which  contribute  to  the  claim  that  a piece  of  software  exhibits  a well-defined  interface. 
Although  the  properties  shown  are  only  a subset  of  those  previously  identified  [ARTJ86],  they 
represent  bcth  confirming  and  contrasting  elements.  For  example,  the  use  of  global  variables  has  a 
negative  impact  on  well-defined  interfaces.  The  use  of  structured  data  in  parameter  calls,  however, 
has  a positive  impact. 

Hence,  to  determine  the  effectiveness  of  a methodology  one  assesses  the  extent  to  which  product 
attributes  are  present  (or  absent),  and  then  propagate  the  results  of  that  assessment  through  the 
sets  of  linkages  defined  by  the  evaluation  procedure.  As  discussed  by  Kearney  [KEAJ86],  however, 
that  assessment  process  must  be  predicated  on  validated  metrics. 
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5.  Application  of  the  Evaluation  Procedure 


Based  on  the  denned  set  of  linkages  among  objectives,  principles,  and  attributes,  the  opera- 
tional specification  of  the  evaluation  procedure  is  guided  by  two  fundamental  axioms: 

(1)  the  methodology  description  and  project  requirements  provide  standards,  conventions, 
and  guidelines  that  describe  how  to  produce  a product,  and 

(2)  the  project  documentation,  code,  and  code  documentation  reflect  how  well  the  develop- 
ment process  prescribed  by  the  methodology  is  followed. 

As  described  below,  an  application  of  the  evaluation  procedure,  guided  by  the  above  two 
axioms,  illustrates  the  utility  and  intrinsic  p rower  of  the  evaluation  procedure  in  assessing  the 
adequacy  and  effectiveness  of  a methodology.  Provided  in  this  illustration  is  a characterization 
of  the  components  used  in  the  evaluation  process,  an  individual  assessment  of  two  methodology 
descriptions,  an  analysis  of  associated  products,  and  a summary  of  the  results.  The  authors 
note  that  a substantial  part  of  the  characterization  and  assessment  process  is  embodied  in  the 
operational  aspects  of  applying  the  evaluation  procedure.  Length  restrictions,  however,  prevent 
their  discussion.  For  information  on  the  operational  aspects  the  authors  refer  the  interested  reader 
to  [ARTJ86,  ARTJ87J. 

5.1  Data  Sources 

A joint  investigation  of  two  comparable  Navy  software  development  methodologies  and  respec- 
tive products  is  detailed  in  [NANR85].  The  investigation  effort  utilizes: 

• Four  software  development  methodology  documents  for 

(1)  identifying  the  pronounced  software  engineering  objectives,  principles,  and  attributes, 
and 
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(2)  assessing  the  adequacy  of  each  methodology  through  the  objective/prindple/attribute 
linkages  defined  by  the  evaluation  procedure,  and 

• Eight  software  system  documents  and  118  routines,  comprising  S300  source  lines  of  code, 
for 

(1)  determining  the  evident  set  of  product  attributes,  and 

(2)  via  the  attribute/prindple/objective  linkages,  empirically  assessing  the  prindples  and 
objectives  emphasized  during  product  development. 


The  following  section  provides  a summary  of  the  results  and  illustrates  the  utility  aac  crsatiiity 
of  the  procedural  approach  to  evaluating  software  development  methodologies.  For  simplicity,  we 
refer  to  the  software  systems  as  system  A and  system  B (and  methodology  A,  methodology  B, 
respectively). 

5.2  Analyzing  the  Methodological  Description  and  Associated  Product 

The  initial  step  in  the  evaluation  process  is  to  perform  a “top-down"  analysis  of  methodologies 
A and  B,  to  reveal  the  set  of  claimed  software  engineering  objectives,  principles,  and  attributes. 
Because  both  methodologies  have  experienced  evolutionary  development,  a clear  statement  of  their 
respective  methodological  objectives  is  lacking.  Nonetheless,  as  detailed  in  Figure  4,  the  docu- 
mentation for  methodology  A does  appear  to  stress  the  objectives  of  reliability  and  correctness 
supported  by  the  principles  of  structured  programming,  hierarchical  decomposition , and  functional 
decomposition.  Following  the  objective/ principle  relationships  defined  by  the  evaluation  proce- 
dure, for  each  objective  stressed  in  methodology  A only  three  of  the  necessary  six  principles  are 
emphasized.  The  implication  is,  that  unless  the  principles  of  life-cycle  verification,  information 
hiding  and  stepwise  refinement  are  implicitly  assumed  and  utilized,  correctness  and  reliability,  are 
compromised. 
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Reliability 

Yes 

Yes 

Portability 

Adaptability 

Yes 

Principles 

Hierarchical  Decomposition 

Yes 

Functional  Decomposition 

Yes 

Information  Hiding 

Stepwise  Refinement 

Structured  Programming 

Yes 

Yes 

Concurrent  Documentation 

Yes 

Life  Cycle  Verification 

Attributes 

None 

None 

Figure  4 

Pronounced  Objectives,  Principles,  and  Attributes 


Using  metric  values  and  properties,  a corresponding  Ubottora-up”  examination  of  product  A 
provides  some  interesting  results.  The  Kiviat  graph  displayed  in  Figure  5a  illustrates  the  extent  to 
which  each  attribute  is  assessed  as  present  in  the  product.  (Attribute  ratings  are  restricted  to  an 
arbitrarily  chosen  1-10  scale.)  Note  that  (reduced)  complexity  attains  the  highest  rating  - 8.0  out  chi' 
10.0,  closely  followed  by  readability  (7.4)  and  cohesion  (6,8).  Based  on  the  three  principles  stressed 
in  methodology  A,  the  evaluation  procedure  predicts  that  (reduced)  complexity,  readability,  and 
cohesion  should , in  fact,  be  among  the  product  attributes. 


In  concert  with  the  stated  objectives  and  principles  for  methodology  A,  Figure  5b  reveals  that 
structured  programming  (7.7)  is  the  prominent  principle  used  in  developing  system  A.  followed 
by  stepwise  refinement  (6.7),  hierarchical  decomposition  (6.4),  and  functional  decomposition  (6.4). 
Figure  5c  depicts  the  results  of  emphasizing  these  principles  in  terms  of  methodology  objectives.  In 
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Early  Error 
Detection 


Coupling 


Visibility 
of  Behavior 


Traceability 


Cohesion 


Complexity 


Ease  of 
Change 


Readability 


(a) 


Hierarchical 

Decomposition 


Well-Defined 

Interface 


Concurrent 

Documentation 


Life-cycle 

Verification 


Functional 

Decomposition 


Information 

Hiding 


Structured 

Programming 


Stepwise 

Refinement 


(b) 


Methodology  A 
Mchodology  B 


Maintainability 


Figure  5 

Detected  Presence  of  Objectives,  Principles,  and  Attributes 

particular,  reliability  is  rated  as  the  major  software  development  objective  (6.7).  Alt  hough  correct- 
ness is  also  stressed  by  methodology  A,  ascertaining  correctness  necessitates  life-cycle  verification. 
This  principle  is  neither  emphasized  by  methodology  A,  nor  evident  in  the  software  product.  As 
illustrated  by  Figures  5a,  5b  and  5c,  other  objectives  and  principles  are  given  some  emphasis  during 
the  software  development  process  for  system  A.  It  is  the  authors’  opinions,  however,  that  because 
they  are  not  explicitly  stressed  in  methodology  A,  the  associated  product  suffers. 

For  methodology  B,  the  objectives  enunciated  in  the  documentation  are  maintainability , adapt - 
ability , and  reliability . Structured  programming  and  concurrent  documentation  are  the  empaasized 
principles.  Like  methodology  A,  however,  a complete  set  of  supporting  principles  are  not  stated. 
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Hierarchical  decomposition , functional  decomposition,  and  to  some  extent  in/ormoXxon  hiding  are 
implicitly  assumed  as  underlying  principles  of  methodology  B.  According  to  the  linkages  among 
objectives  and  principles,  all  of  the  above  principles  (both  stated  and  assumed)  are  required  to 
achieve  the  objectives  explicitly  stated  in  methodology  B. 

Subsequent  analysis  of  product  B and  a abottora-up"  propagation  of  the  results  through  the 
linkages  defined  by  the  evaluation  procedure  reveals  structured  programming  as  the  most  prominent 
principle  (8.3),  closely  followed  by  concurrent  documentation  (7.0).  Moreover,  the  evaluation  also 
indicates  that  the  implicitly  assumed  principles  of  methodology  B are  utilized  - stepwise  refinement, 
hierarchical  decomposition,  functional  decomposition,  and  information  hiding  rate  6.9,  6.7,  6.7.  and 
6.3,  respectively.  Finally,  the  results  imply  that  during  the  development  of  product  B the  objectives 
of  maintainability,  adaptability,  and  reliability  are  most  emphasized.  The  above  assessments  are 
illustrated  in  Figures  5a,  5b,  and  5c. 

To  summarize,  the  evaluation  procedure  reveals  that  both  methodologies  lack  a clear  statement 
of  goals  and  objectives,  as  well  as  sufficient  principles  for  achieving  the  objectives  that  are  em- 
phasized. Moreover,  glaring  deficiencies  are  apparent  in  both  software  development  methodologies. 
That  is,  both  fail  to  actively  support  the  principle  of  information  hiding  and  also  have  difficulties 
in  incorporating  the  desirable  attributes  of  traceability  and  well-defined  interfaces  in  respective 
system  products.  In  general,  the  evaluation  procedure  does  accurately  assesses  the  software  engi- 
neering objectives,  principles,  and  attributes  espoused  by  methodologies  A and  B.  Of  particular 
significance,  however,  is  that  the  objectives  and  principles  determined  to  be  “emphasized"’  during 
the  product  development  process,  yet  not  stated  in  the  methodology  documentation,  are  precisely 
those  that  are  implicitly  assumed  important  by  the  software  engineers  developing  products  A and 
B.  A more  detailed  account  of  the  evaluation  can  be  found  in  [NANR85]. 

6.  Conclusion 

Tods,  techniques,  environments,  and  methodologies  dominate  the  software  engineering  lit- 
erature, but  relatively  little  research  in  the  evaluation  of  methodologies  is  evident.  This  work 
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reports  an  initial  attempt  to  develop  a procedural  approach  to  evaluating  software  development 
methodologies.  Prominent  in  this  approach  are: 

(1)  an  explication  of  role  of  a methodology  in  the  software  development  process, 

(2)  the  development  of  a procedure  based  on  linkages  among  objectives,  principles,  and 
attributes,  and 

(3)  the  establishment  of  a basis  for  reduction  of  the  subjective  nature  of  the  evaluation  through 
the  introduction  of  properties. 

An  application  of  the  evaluation  procedure  to  two  Navy  methodologies  has  provided  consistent 
results  that  demonstrate  the  utility  and  versatility  of  the  evaluation  procedure  [NANR85].  Current 
research  efforts  focus  on  the  continued  refinement  of  the  evaluation  procedure  through 

(a)  the  the  identification  and  integration  of  product  quality  indicators  reflective  of  attribute 
presence,  and 

(b)  the  validation  of  metrics  supporting  the  measure  of  those  indicators. 

The  consequent  refinement  of  the  evaluation  procedure  offers  promise  of  a flexible  approach  that 
admits  to  change  as  the  field  of  knowledge  matures.  In  conclusion,  the  procedural  approach 
presented  in  this  paper  represents  a promising  path  toward  the  end  goal  of  objectively  evaluating 
software  engineering  methodologies. 
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THE  ORIGIN 


Issue: 


Immediate  Software  Development  Issues 

for 

Embedded  Systems  Applications 
in  Surface  Combatants 
(25  March  - 15  September  1985) 


Multiple  Software  Development  Methodologies 

(a ) Review  two  software  development  “methodologies”  (A  and  B) 

05)  Compare  and  evaluate  A and  B 

(c)  Assess  the  costs  and  benefits  of: 

• Continuing  with  multiple  software  development 
methodologies 

• Using  only  one  software  development  methodology 

• Transitioning  to  an  alternate  software  development 
methodology 

A ==>  B 

B =>  A 

A/B  =>  ??? 
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OUTLINE 


Evaluation  Approach 

- Objectives,  Principles,  Attributes  Framework 

Development  of  an  Evaluation  Procedure 

- Software  Engineering  and  Software  Development 

- A Structured  Evaluation  Procedure 

- Data  Sources 

Application  of  the  Evaluation  Procedure 

- Summary  of  Sample  Data 

- Illustration  of  Procedure  Application 

Summary  of  Results 
Future  Work 
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EVALUATION  APPROACH 


1 . Develop  an  Evaluation  Procedure 

- What  is  a methodology? 

- How  can  they  be  compared? 

2.  Apply  the  Evaluation  Procedure 

- In  consonance  with  our  Navy  sponsor,  and  with 

- Contributions  from  software  development  sites 
and  oversight  agencies. 
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ON  METHODOLOGIES 


What  is  a methodology? 


A methodology  is  a collection  of  complementary  methods, 
and  a set  of  rules  for  applying  them.  More  specifically,  a 
methodology 


(1)  organizes  and  structures  tasks  comprising  the  effort  to 
achieve  a global  objective,  establishing  the 
relationships  among  tasks, 


(2)  defines  methods  for  accomplishing  individual  tasks 
(within  the  context  of  the  global  objective),  and 


(3)  prescribes  an  order  in  which  certain  classes  of 
decisions  are  made,  and  ways  of  making  those 
decisions  that  lead  to  the  desired  objective. 


RATIONALE  FOR 
THE  EVALUATION  PROCEDURE 


A set  of  objectives  can  be  identified  that  include  those 
postulated  by  any  software  engineering  methodology.  A 
methodology  defines  those  principles  that  characterize  a 
proper  and  appropriate  development  process.  Adherence  to  a 
process  governed  by  these  principles  should  result  in  a 
product  (programs  and  documentation)  that  possesses 
attributes  considered  desirable  and  beneficial. 


Philosophy  tempered  by  practical  concerns: 

(1)  Sets  of  objectives,  principles,  attributes  are^open. 

(2)  Prioritization  of  objectives  recognized. 

(3)  Components  of  large  software  system  vary  - attribute 
sampling. 

(4)  Flexible  application  of  evaluation  procedure  - 
consonant  with  project  objectives. 


framework  for  software  development 


OBJECTIVES 

Maintainability 

Correctness 

Reusability 

Testability 

Reliability 

Portability 

Adaptability 


PRINCIPLES 

Hierarchical  Decomposition 
Functional  Decomposition 
Information  Hiding 
Stepwise  Refinement 
Structured  Programming 
Life-Cycle  Verification 
Concurrent  Documentation 


ATTRIBUTES 

Reduced  Coupling 
Enhanced  Cohesion 
Reduced  Complexity 
Well-Defined  Interfaces 
Readability 
Ease  of  Change 
Traceability 
Visibility  of  Behavior 
Early  Error  Detection 
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PROCEDURE  DEVELOPMENT 


w 


% 


1.  Identify  Objectives 

- What  qualities  are  desirable? 

2.  Define  Principles 

- How  are  desirable  qualities  obtained? 

3.  Link  Principles  to  Objectives 

- Which  principles  contribute  to  each  objective? 

4.  Define  Resulting  Attributes 

- Use  of  a principle  induces  what  desirable  attributes? 

5.  Define  Properties  Associated  with  Attributes 

- What  properties  give  evidence  of  attribute  presence  or 
absence? 

- How  to  measure  properties? 

% 
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PRIMARY  SOFTWARE  ENGINEERING 
OBJECTIVES 


(1)  Adaptability  - the  ease  with  which  software  can 

accommodate  to  changing  requirements 

(2)  Correctness  - strict  adherence  to  specifications 

(3)  Maintainability  - the  ease  with  which  corrections 

can  be  made  to  respond  to  recognized 
inadequacies 

(4)  Portability  - the  ease  in  transferring  software  to 

another  host  environment 


(5)  Reliability  - the  error-free  behavior  of  software  over 

time 

(6)  Reusability  - the  use  of  developed  software  in  other 

applications 


(7)  Testability  - the  ability  to  evaluate  conformance 
with  specifications 
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PRIMARY  SOFTWARE  ENGINEERING 
PRINCIPLES 


(1)  Abstraction  - defining  each  program  segment  at  a 
given  level  of  refinement 

(a)  Hierarchical  Decomposition  - components 

defined  in  a top-down  manner 

(b)  Functional  Decomposition  - components 

partitioned  along  functional  boundaries 


(2)  Concurrent  Documentation  - management  of 

supporting  documents  (system  specifications,  user 
manuals,  etc)  throughout  the  life  cycle 


(3)  Information  Hiding  - insulating  the  internal  details 
of  component  behavior 


(4)  Life  Cycle  Verification  - verification  of 

requirements  throughout  the  design,  development, 
and  maintenance  phases  of  the  life  cycle 


(5)  Stepwise  Refinement  - utilizing  convergent  design 


(6)  Structured  Programming  - using  a restricted  set  of 
program  control  constructs 
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OBJECTIVES  / PRINCIPLES  LINKAGES 


Adaptability 


Correctness 


Maintainability 


Portability 


Reliability 


Reusability 


Testability 


Concurrent  Documentation 


Functional  Decomposition 


Hierarchical  Decomposition 


Information  Hiding 


Life  Cycle  Verification 


Stepwise  Refinement 


Structured  Programming 
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PRIMARY  SOFTWARE  ENGINEERING 
ATTRIBUTES 


V 


>4 


(1)  Cohesion  - The  binding  of  statements  within  a 

software  component 

(2)  Complexity  - an  abstract  measure  of  work 

associated  with  a software  component 

(3)  Coupling  - the  interdependence  among  software 

components 

(4)  Early  Error  Detection  - indication  of  faults  in 

requirements,  specification  and  design  prior  to 
implementation 

(5)  Ease  of  Change  - software  that  accommodates 

enhancements  or  extensions 

(6)  Readability  - the  difficulty  in  understanding  a 

software  component 

(7)  Traceability  - the  ease  in  retracing  the  complete 

history  of  a software  component  from  its  current 
status  to  its  design 

(8)  Visibility  of  Behavior  - the  provision  of  a review 

process  for  error  checking 

(9)  Well-Defined  Interfaces  - the  definitional  clarity 

and  completeness  of  a shared  boundary  between 
software  and/or  hardware  (software/software, 
software/hardware) 
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PRINCIPLES  / ATTRIBUTES  LINKAGES 


Concurrent 

Documentation 


Functional 

Decomposition 


Hierarchical 

Decomposition 


Information  Hiding 


Life  Cycle 
Verification 


Stepwise 

Refinement 


Structured 

Programming 


Cohesion 


Complexity 


Coupling 


Early  Error  Detection 


Ease  of  Change 


Readability 


Traceability 


Visibility  of  Behavior 


Well-Defined  Interfaces 
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ILLUSTRATION  OF  THE  EVALUATION  PROCEDURE 


Hierarchical 

Functional 

Informaition 

Stepwise 

Structured 

Concurrent 

j iSJ*l  i » w * i 

Decomposition 

Hiding 

Refinement 

Programming 

Documentation 

Coupling 


Cohesion 


Well-Defined  | 
Interfaces 


Ease  of 
Change 


Complexity 


Use  of  Global 
Variables 


Use  of 
Parameters 


Parameterless 

Calls 


Excessive  # 
of  Parameters 


Use  of  Data 
Structures 
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SETS  OF  DEFINED  LINKAGES 


Coupling  Cdmnou  Complexity  I I 


Ease  of 
Change 


Traceability 


Visibility  of 
Behavior 


t t 


Control 

Structures 

GOTO’s 

Special 

Characters 

Code 

Indentation 

Symbolic 

Constants 

# Block 
Comments 

Documentation 

Readability 

Meaningful 

Names 

Completion  <fc 
Acanacyof 
Documentation 

# Single  Line 
Comments 
In  Line* 

Embedded 

Alternate 

Language 

Correct 
Grammar  & 
Spelling 

Short 

Modules 

Consistent 

Comments 
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THE  OPA  FRAMEWORK  FOR  EVALUATION: 

SUMMARY 


Fundamental  to  the  evaluation  procedure  are  several  sets  of 
linkages: 


Linkages 

Defined 

Substantiated 

• Objectives  / Principles 

(33) 

(33) 

• Principles  / Attributes 

(24) 

(24) 

• Attributes  / Properties 

(125) 

(114) 

66  Automatable 

Assessing  the  adequacy  of  a methodology  is  achieved  through  a 
"top-down"  evaluation  process. 


Assessing  the  effectiveness  of  a methodology  is  achieved  through 
a "bottom-up"  evaluation  process. 
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APPLICATION  OF  THE  PROCEDURE: 
SUMMARY  OF  SAMPLE  DATA 


A:  The  Combat  System  Development  Plan 

The  Computer  Programming  Manual 
The  Program  Development  Manual 
Six  Numbered  Documents  (PDS,  IDS) 

B:  Functional  Description  Document 

Two  Numbered  Documents  (PDS,  IDS) 


Source  Code: 

A:  Routines  = 17  SLOCS  =1170 

SysProcs  = 2 SLOCS  = 1370 

B:  Routines  = 99  SLOCS  = 5729 
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DATA  SOURCES  AND  IMPLICATIONS 


Methodology 

Description 

Project 

Requirements 


Standards 

Conventions 

Guidelines 


How  to  r 
doit  l 


Project 

Documentation 
Code  and 

Code  Documentation 


PPS 

jj)S  How  well 

pj)S  is  it  done 

Programs 


Objectives 

Principles 


Principles 

Attributes 


AN  ACCUMULATION  OF  EVIDENCE 


"Demonstrating  that  software  possesses  a desired  attribute 
(or  does  not)  is  not  a proof  exercise;  rather,  it  resembles 
an  exercise  in  civil  litigation  in  that  evidence  is  gathered 
to  support  both  contentions  (the  presence  or  absence)  and 
weighed  on  the  scales  of  comparative  judgement." 


Measurement  Scale 


ELEMENTS,  METRICS  AND  PROPERTIES 


Relationship 


Code  Example 


Documentation 

Example 
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ASSESSING  "METHODOLOGICAL"  EFFECTIVENESS 

(ATTRIBUTES) 


A 

B 

Coupling 

5.4 

5.9 

Cohesion 

6.8 

6.4 

Complexity 

8.0 

8.4 

Well-Defined  Interfaces 

47 

18 

Readability 

7.4 

8.2 

Ease  of  Change 

5.6 

6.0 

Visibility  of  Behavior 

5.6 

5.8 

Early  Error  Detection 

5.6 

5.8 

Traceability 

L2 

53 

Both  have  difficulty  with  Traceability  and  Well-Defined 
Interfaces 
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KIVIAT  GRAPH  FOR  ATTRIBUTES 


Cohesion 


Methodology  A 

Methodology  B — ■ 


Hierarchical 

Decomposition 


KIVIAT  GRAPH  FOR 


Methodology  A 
Methodology  U 


RESULTS  OF  PROCEDURE  APPLICATION 


Assessing  "Methodological"  Adequacy 


A:  Stresses  Objectives  of  Reliability  and  Correctness 

Emphasizes  Principle  of  Structured  Programming 

Methodology  A was  (and  is)  an  "evolving 
methodology" 


B:  Stresses  Objectives  of  Maintainability, 

Adaptability,  Reliability,  and  Correctness 
Emphasizes  Principles  of  Modular  Decomposition, 
Structured  Programing  and  Concurrent 
Documentation 


At  the  Objectives  level,  both  “methodologies”  support  stated 
project  objectives. 


At  the  Principles  level,  both  “methodologies”  lack  the 
enunciation  of  proper  Principle  usage. 


No  reference  to  desired  Attributes  is  found 
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FUTURE  RESEARCH 

Applying  the  Evaluation  Procedure  to 
Software  Quality  Assurance 


Predicting  and/or  assessing  software  quality  necessitates  a 

• Systematic  approach  to 

• Assessing  product  (or  process)  conformance  with 

• Acceptance  criteria  (standards  and  guidelines) 


The  Evaluation  Procedure 

• Currently  supports  a well-defined,  systematic 

approach  for  evaluating  software  products,  and 

• Provides  a rigorous  framework  for 

- Relating  acceptance  criteria  based  on  attributes 
to  software  engineering  principles  and 
objectives,  and 

- Defining  acceptance  levels  based  on  measures 
reflecting  the  achievement  of  objectives, 
adherence  to  principles  and  realization  of 
attributes. 
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ABSTRACT  ^ 

This  paper  will  define  the  framework  of  a general  method  for  selecting  a necessary  and  sufficient  subset  of  a general 
software  life  cycle's  information  products,  to  support  new  software  development  projects.  Procedures  for  character- 
izing problem  domains  in  general  and  mapping  to  a tailored  set  of  life  cycle  processes  and  products  will  be  given.  An 
overview  of  the  method  is  shown  using  the  following  steps: 

1.  During  the  problem  concept  definition  phase,  perform  standardized  interviews  and  dialogs  between  devel- 
oper and  user,  and  between  developer  and  customer. 

2.  Generate  a quality  needs  profile  of  the  software  to  be  developed,  based  on  information  gathered  in  step  1. 

3.  Translate  the  quality  needs  profile  into  a profile  of  quality  criteria  that  must  be  met  by  the  software  to  satisfy 
the  quality  needs. 

4.  Map  the  quality  criteria  to  a set  of  accepted  processes  and  products  for  achieving  each  criterion. 

i.  Select  the  information  products  which  match  or  support  the  accepted  processes  and  product  of  step  4. 

6.  Select  the  design  methodology  which  produces  the  information  products  selected  in  step  5. 

This  paper  will  address  every  step,  but  will  not  attempt  to  generate  a full-up  methodology.  A few  of  the  more  popular 
process  models  and  design  methodologies  known  today  will  be  examined  for  their  information  content. 


TERMINOLOGY  NOTES 

The  terms  "software  process  model"  and  "life  cycle"  will  be  used  interchangeably.  The  term  "user"  will  always  mean 
"customer  and  user". 


INTRODUCTION 

The  complete  set  of  information  products  defined  for  common  software  process  models  and  development  method- 
ologies is  often  too  large  fbr  certain  development  efforts.  In  many  cases,  a subset  of  informauon  products  and  the 
activities  that  produce  them  will  suffice  to  administer  the  development  of  a software  product.  The  act  of  selecting 
appropriate  information  products  and  activiues  to  support  the  development  effort  is  called  "tailoring"  the  life  cycle 
or  development  methodology.  This  tailoring  process  is  currently  an  ad  hoc  method  performed  by  managers  and 
developers,  in  early  meetings  with  the  customer  and  user,  as  they  begin  to  define  some  sort  of  Software  Management 
or  Development  Plan.  This  paper  explores  a more  formalized  tailoring  method  to  assist  in  the  definition  of  such 
plans.  It  is  hoped  that  such  a formalization  will  both  speed  the  process  and  help  ensure  the  selection  of  a necessary 
and  sufficient  subset  of  information  products  (and  by  implicauon.  the  acuvtues  which  produce  them). 

The  cornerstone  of  this  tailoring  method  uses  Software  Quality  Assurance  (SQA)  techniques.  Traditionally.  SQA 
has  dealt  with  the  detection  and  prevention  of  defecuve  software.  New  ideas  in  the  field  of  SQA  are  concentraung  on 
beginning  the  function  much  earlier  in  the  life  cycle,  as  early  as  problem  concept  and  initial  requirements  definition 
It  is  hoped  that  SQA  principles  will  assist  the  user  and  developer  in  creating  complete,  consistent  and  testable 
require n^ents^This  assistance  offers  guidelines  up  front  when  we're  scrambling  to  put  some  sensible  words  on  paper 
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1 believe  that  two  quotes  [5],  (21)  can  illustrate  the  idea  of  ’engineering  in*  quality  to  a software  product. 


You  can’t  achieve  Quality... 
unless  you  specify  it! 


Quality  must  be  defined  as  conformance 
to  requirements,  not  as  ’’goodness” 


USING  SQA  TECHNIQUES  TO  SPECIFY  QUALITY 


Quality  Factors 

This  is  a common  SQA  term.  Quality  Factors  are  characteristics  which  a software  product  exhibits  that  reflect  the 
degree  of  acceptability  of  the  product  to  the  user.  Since  we’re  moving  SQA  up  front,  we'll  restate  this:  Quality 
Factors  are  characteristics  which  the  user  requires  the  software  to  exhibit  in  order  to  reflect  the  best  possible  degree 
of  acceptability. 

Table  1 shows  a list  of  Quality  Factors  which  has  been  coming  into  general  use  for  some  time  [21].  It  was  first 
proposed  at  the  Rome  Air  Development  Center  (RADC)  in  1977.  I show  a slightly  expanded  list,  as  it  has  evolved 
somewhat  since  then  [5]. 

There  are  more  detailed  meanings  of  the  quality  factors  which  guide  the  user  developer  in  determining  how 
important  each  factor  is  for  their  application. 

Not  every  project  requires  all  quality  factors,  which  is  good,  because  some  quality  factors  are  at  conflicting  purpose. 
Shown  below  is  a list  of  factors  whose  characteristics  cause  conflicts  of  definition. 

quality  Factor  Conflict Ewianaticn  oL QOnDiCL 

Efficiency  vs.  Integrity Overhead  required  to  control  access  negates  efficiency. 

Efficiency  vs.  Usability Overhead  required  to  ease  operations  negates  efficiency. 

Efficiency  vs.  Maintainability— Optimized  code  negates  maintainability.  Modularization,  instrumentation 

and  well  commented  high-level  code  increases  overhead. 

Efficiency  vs.  Testability Optimized  code  negates  testability. 

Efficiency  vs.  Portability Optimized  code  is  dependent  on  host  processor  services. 

Efficiency  vs.  Flexibility Overhead  required  to  support  flexibility  negates  efficiency. 

Efficiency  vs.  Reusability  ■ — Overhead  required  to  support  reusability  negates  efficiency. 

Efficiency  vs  Interoperability— Overhead  required  to  support  interoperability  negates  efficiency. 

Integrity  vs.  Flexibility  Flexibility  requires  general  ana  flexible  data  structures,  increasing  data 

security  problems. 

Integrity  vs.  Reusability  — Generality  required  by  reusable  software  introduces  protection  problems. 
Integrity  vs.  Interoperability Coupled  systems  allow  more  avenues  of  access. 

Reusability  vs.  Reliability  — Generality  required  by  reusable  software  increases  difficulty  of  providing 

error  tolerance  (anomaly  management)  and  accuracy. 

The  conflicts  shown  do  not  mean  that  the  two  factors  are  in  strict  mutual  exclusion  — extra  effort  may  be  expended 
to  address  the  difficulties  of  specifying  factors  in  conflict.  Note  that  efficiency  tends  to  conflict  with  many  other 
factors.  This  is  due  to  the  tradeoff  with  the  additional  overhead  required  to  satisfy  other  quality  factors  that  does  not 
necessarily  apply  tc  the  algorithm's  basic  function.  Efficiency  issues  may  also  be  resolved  by  judicious  hardware 
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Correctness 

— Conformance  of  software  design  and  implementation  to  stated  require- 
ments. 

Efficiency 

— Economy  of  resources  needed  to  provide  the  required  functionality. 

Expandability 

— Ease  of  maintaining  the  software  to  meet  new  functional  or  performance 
requirements. 

Flexibility 

— Ease  of  maintaining  the  software  to  work  in  environments  other  than 
originally  required. 

Integrity — 

— Security  against  unauthorized  access  to  programs  and  data. 

Interoperability 

— Ease  of  coupling  the  software  w\th  software  in  other  systems  or  applica- 
tions. 

Maintainability 

— Ease  of  finding  and  fixing  errors. 

Manageability 

— Ease  of  administrating  development,  maintenance  and  operation  of  the 
software. 

Portability* 

— Ease  of  maintaining  the  software  * > execute  on  a processor  or  operating 
system  other  than  that  originally  required. 

Usability— 

— Ease  of  learning  <St  using  the  software,  and  of  preparing  input  & interpret- 
ing output. 

Reliability 

— The  rate  of  failures  in  the  software  that  render  it  unusable. 

Reusability 

— Suitability  of  software  modules  for  use  in  other  applications. 

Safety 

— Protection  against  loss  of  life  or  liability  or  damage  to  property. 

Survivability 

— Continuity  of  reliable  execution  in  the  presence  of  a system  failure. 

Verifiability  (testability) 

— Ease  of  verification  of  functionality  against  requirements. 

Table  1 - Quality  Factors 

selection.  Note  that  there  is  also  a reverse-i.iatrix  of  quality  factors  (not  shown)  that  tend  to  suppon  one  another, 
such  as  testability  and  maintainability  — similar  *sts  of  criteria  suppon  both  factors. 

So  you  get  the  idea  of  defining  quality  needs  for  specific  applications.  As  this  process  of  definition  continues,  a 
profile  begins  to  emerge  that  descnbes  the  proposed  software  in  terms  of  weighted  quality  factors. 


I introduce  this  term  to  describe  the  prioritized,  weighted  list  of  quality  factors  that  the  user  Sl  developer  define  for 
their  software  development  effon.  The  Quality  Profile  is  a "signature"  or  "fingerprint"  of  a project’s  quality  needs. 
Humphrey  [10)  offers  a common-sense  example  of  what  kinds  of  factors  are  important  for  different  applications, 
based  upon  the  "primary  concern"  of  the  application. 


iddfridivM.es* 


a. 

Effect  on  human  lives 

Reliability.  Correctness.  Testability 

b. 

Long  life  Cycle 

Maintainability,  Flexibility,  Portability 

r 

i 

c. 

Real  time  application 

Efficiency,  Reliability.  Correctness 

* 

d. 

In-house  tool 

Efficiency.  Reliability.  Correctness 

e. 

Classified  Informauon 

Integrity 

f. 

Communicating  systems 

Interoperability 

The  High  Pnomy  Quality  Factors  shown  for  each  type  of  application  begin  to  define  that  application's  quality  profile. 
The  profile  of  an  applicauon  of  type  "a"  is  given  by  high  degrees  of  reliability,  correctness  and  test abddy.  zr.d  io*er 


Arend  1990a 


M.  Artnd 

McOooncil  Douglas  „ 
Pag«3oM!  J 


A Method  for  Tailoring  tha  information  Contant  of  a Software  Procasa  Modal 


degrees  of  the  remaining  factors.  In  practice,  we  define  a more  precise  scale  of  degrees  and  assign  a particulars'  ;h* 
to  each  factor.  The  resultant  set  of  quality  factor  weights  defines  the  quality  profile  for  the  proposed  soft*-, re. 

Another  example,  more  generic,  is  given  by  Deutsch  [5]  to  suggest  an  initial  prioritization  of  Quality  Factor  oy 
"software  category”. 


a. 

Critical 

Safety,  Survivability,  Correctness,  Maintainability,  Efficiency 

b. 

Support 

Maintainability,  Verifiability,  Interoperability,  Portability,  Usability,  Correctness 

* 

c. 

I/O 

Correctness,  Interoperability,  Maintainability 

d. 

Data 

Interoperability,  Portability,  Reusability 

e. 

Computational 

Correctness.  Maintainability 

~ 

f. 

Environment 

Maintainability,  Verifiability,  Correctness,  Interoperability.  Portability,  Reusabil- 
ity, Efficiency,  Integrity 

• 

g- 

MMI 

Integrity,  Usability 

h. 

Documentation 

Correctness,  Maintainability 

• 

i. 

Design 

Expandability,  Flexibility,  Interoperability,  Maintainability.  Portability,  Reusabil- 
ity, Verifiability 

•* 

These  two  examples  offer 
multiple  concerns  or  cover 
specific  application. 


starting  points  for  the  development  of  a Quality  Profile.  Many  applicatx>ns  will  exhibit 
several  categories.  It  is  the  job  of  the  user  <fc  developer  to  define  the  Quality  Profile  for  the 


Deutsch  [5]  suggests  a metric  for  ranking  or  weighting  quality  factors. 

Lev* l nf  quality  required  _ Whtu  techniques  should  be  used  10  ensure  a quality  factor  of  this  mnk 
E Excellent  Exceptional  techniques 

G Good  Better  than  average  techniques 

A Average  Normal  corporate  practices 

NI  Not  an  Issue  No  special  techniques 

He  then  extends  the  metric  into  the  realm  of  cost  and  schedule  prediction,  using  Jensen  and  COCOMO  modes 
relative  cost  and  relative  schedule  analysis  factors.  Cost  and  schedule  prediction  will  not  be  pursued  further  here. 

Latter  day  SQA  is  also  developing  standardized  means  by  which  the  user  and  developer  discuss  and  come  -:a 
agreement  of  these  factors  for  each  application.  These  means  often  take  the  form  of  questionnaires  that  promp. 
user  to  evaluate  all  needs  for  quality. 


This  is  a common  SQA  term.  Quality  Criteria  are  detailed  subcharacteristics  which  the  software  exhibits  that  reflect 
the  degree  to  which  the  Quality  Factors  are  present.  In  other  words,  the  planned  presence  of  high— level  qualirr 
factors  implies  the  presence  of  a detailed  set  of  quality  criteria. 

The  Quality  Factors  are  user-oriented;  they  are  designed  to  map  easily  to  a user's  needs  for  the  proposed  software, 
The  Quality  Criteria  are  more  software-oriented;  they  are  designed  to  map  easily  to  characteristics  that  may  be 
evaluated  by  direct  testing  of  the  software.  The  relationship  between  quality  factors  and  quality  criteria  is  analogous 
to  that  between  the  two  common  stages  of  requirements  definition.  The  analogy  does  not  apply  to  the  amount  of 
effort  needed  to  go  from  the  early  phase  to  the  later  — Quality  Factors  may  be  translated  immediately  to  Quality 
Criteria.  Table  2 shows  a list  of  Quality  Criteria  [5],  [21]. 


There  is  a direct  translation  from  each  Quality  Factor  to  a subset  of  Quality  Criteria  which  support  the  factor.  The 
sets  of  criteria  that  support  different  factors  ’iy  be  disjoint  or  may  intersect.  Some  criteria  exhibit  conflicts  similar  to 
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Quality  Criterion  _ Ktoanlnw  nf  tritrrian  in  cnnttxt  of  software  nrnduct 

Accuracy Achievement  of  required  precision  in  calculations  and  outputs 

Anomaly  Mgrat Behavior  for  recovery  from  failures 

Augmenttbility Maintenance  effort  required  to  expand  upon  functions  and  data 

Autonomy —Degree  of  decoupling  from  execution  environment 

Commonality — —Use  of  standards  to  match  "look  and  feel"  of  similar  applications 

Communicativeness- Appropriateness  of  inputs  and  outputs 

Completeness Degree  to  which  all  software  is  necessary  and  sufficient 

Conciseness Amount  of  code  used  to  implement  algorithm 

Consistency Use  of  standards  to  achieve  uniformity  within  software 

Distributivity Physical  (device)  separation  of  function  and  data  (addresses  backup) 

Document  Quality Access  to  complete,  understandable  information 

Communication  Efficiency-Usage  of  communication  resources 

Processing  Efficiency Usage  of  processing  resources 

Storage  Efficiency Usage  of  storage  resources 

Functional  Scope Range  of  applicability  of  software  product's  functions 

Generality Range  of  applicability  of  software’s  internal  units 

Independence Degree  of  decoupling  from  support  environment 

Instrumentation Amount  of  code  devoted  to  usage  measurement  or  error  identification 

Modularity Cohesion  Coupling  of  software’s  modules  (design  & code) 

Operability Ease  of  operating  the  software 

Safety  Management Degree  to  which  the  design  addresses  hazard  avoidance 

Self-Descriptiveness- Undentandabiiity  of  design  & code 

Simplicity  ———  Degree  to  which  algorithms  map  to  the  problem  they  solve 

Support —Functionality  that  addresses  the  administration  of  maintenance 

System  Accessibility-  Controlled,  access  to  functions,  data  and  instructions 

System  Compatibility Use  of  standards  to  match  interfaces  with  hardware  Sc  communications 

Traceability Ease  of  finding  links  between  requirements,  design  and  code 

Training Provisions  to  help  users  leam  the  operation  of  the  software 

Virtuality Separation  of  logical  implementation  from  physical  component 

Visibility Objectivity  of  evidence  of  correct  functioning  — ease  of  test  verification 


Table  2 - Quality  Criteria 


those  examined  for  quality  factors.  Table  3 shows  a translation  between  Quality  Factors  and  Quality  Cntena  that 
shows  how  the  criteria  support  and  influence  the  factors,  either  positively  or  negatively.  The  traditional  direction  ot 
translation  is  from  criteria  to  factor  — the  SQA  or  test  team  measures  the  criteria  from  the  software,  and  reports  on 
what  quality  factors  the  software  thus  exhibits.  Our  method  will  begin  with  the  user  definition  of  quality  factors,  and 
develop  a set  of  criteria  that  the  software  must  meet  in  order  to  satisfy  our  quality  needs. 

This  table  is  merged  from  two  different  authors’  approach  to  the  factor/criteria  map  [S],  [21].  Their  perspecuves 
overlap  to  a high  degree,  but  each  one  shows  a few  more,  different  criteria  than  the  other.  I have  included  them  ail 
here  in  order  to  work  with  the  most  complete  universe  of  factors  and  criteria  possible.  Detailed  examination  of  the 
authors’  text  reveals  that  while  some  factors  and  criteria  sound  very  similar,  they  actually  do  describe  different 
characteristics  of  the  software. 
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Quality  Factors 
Quality  Criteria 
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Table  3 - Quality  Factors  <=>  Quality  Criteria  Map 


Symbols  are  used  in  the  cells  of  the  matrix  in  Table  3 to  indicate  the  influence  a criterion  has  on  various  factors. 
Another  viewpoint  is  that  they  indicate  which  criteria  are  necessary  to  support  each  factor.  A plus  under  a factor 
means  that  the  software  should  be  required  to  exhibit  the  corresponding  cnterion,  but  is  subject  to  trade-off  based 
on  any  conflicts  that  arise.  A double  plus  means  that  the  criterion  is  more  important,  and  less  subject  to  trade-off.  A 
negative  under  a factor  means  that  it  would  be  wise  not  to  require  the  software  to  exhibit  the  corresponding  cnterion. 
but  is  subject  to  trade-off  based  on  the  influence  of  other  factors.  A double  negative  means  that  extra  effort  must  be 
expended  to  require  the  software  to  exhibit  the  corresponding  criterion. 
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The  assignment  of  pluses  and  minuses  is  a subjective  process*  but  the  concept  has  been  refined  over  time  by  varaxa 
authors  [5],  [8],  [10],  [21]. 


SOFTWARE  PROCESS  MODELS 

"The  software  process  is  the  technical  and  management  framework  established  for  applying  tools,  methods  and 
people  to  the  software  task”  [10]. 

There  are  a handful  of  well-defined  "process  models"  or  "life-cycles"  in  the  industry  today.  They  each  describe  a 
set  of  activities  and  products  designed  to  support  the  successful  creation  of  a software  product.  The  most  widely  used 
model  is  called  the  Waterfall  model.  Other  models  are  coming  into  use  that  attempt  to  address  the  shortcomings  of 
the  Waterfall,  but  they  tend  to  generate  very  similar  information  products.  Appendix  D offers  a brief  description  of 
other  common  process  models. 

The  Waterfall  model  is  characterized  by  a linear  set  of  activities  and  products  such  that  each  acuvity  uses  the  output 
of  previous  activities  as  its  input.  Here  we  list  general  names  of  the  primary  technical  products  of  a waterfall  mocel. 

Activity  ( phase) Maiou2caduct:  generated  by  acmiiy  tehasc.) 

Concept  Definition Feasibility  Study,  Concept  document 

User  Req.  Definition Level-A  Requirements  Document,  Software  Management  Plan.  System  Interface 

Control  Document  (ICD) 

System  Req.  Definition Level- B Requirements  Document,  Subsystem  ICDs 

System  Design System  Design  Document.  System  Test  Plan 

Implementation  Software,  Test  Case  Document 

Testing Test  Report 

Maintenance Upgraded  Software,  Maintenance  Report 

Note  that  the  waterfall  model  itself  does  not  really  define  details  of  the  information  products  that  are  to  be  producid. 
Most  users  of  the  waterfall  model  recommend  a larger  set  of  documentauon;  these  recommendations  are  usually  laid 
out  in  a documentation  standard. 

SOFTWARE  DOCUMENTATION  STANDARDS 

A Documentation  Standard  defines  all  information  products  that  may  be  generated  to  support  development  of  me 
software  product.  Usually,  a documentation  standard  is  packaged  with  a life-cycle  standard.  Two  common  scandanis 
are: 

SMAP  Information  System  Life  Cycle  & Documentation  Standards  [15] 

DOD-STD-2167A  [6] 

For  this  study,  we  will  use  the  document  set  defined  by  NASA’s  Information  System  Life  Cycle  Documentation 
Standard  — Appendix  A shows  the  complete  list.  Our  tailoring  method  will  address  which  of  these  products  are 
most  important  for  a given  set  of  quality  factors. 

ANALYSIS  & DESIGN  METHODOLOGIES 

Within  the  framework  of  the  software  process  model,  some  method  must  be  used  to  define  the  content  of  each 
product.  Formalized  methodologies  address  the  complex  definition  of  the  requirements  and  design  products  of  the 
software  process.  There  are  many  different  methodologies  to  choose  from  for  use  within  any  software  process.  The 
information  content  of  the  requirements  document,  then,  may  vary  according  to  the  technique  used  to  produce  .t. 

For  example,  one  may  choose  to  specify  system  requirements  using: 
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a.  a simple  textual  notation  developed  in  an  ad  hoc  manner,  or  from  lessons  learned  during  prototyping. 

b.  a functional  decomposition  hierarchy  of  diagrams,  capturing  the  requirements  in  processes  and  data  flows. 

c.  an  information  model,  capturing  the  requirements  in  objects,  relations  and  behavior  diagrams. 

d.  a viewpoint/behavior  model,  capcuring  requirements  in  data/action  maps  and  state  diagrams. 

e.  a hybrid  of  the  above  techniques,  or  other  techniques. 

Appendix  C gives  a brief  overview  of  some  of  the  more  popular  methodologies  in  use  today,  and  lists  all  the  specific 
products  they  offer.  Our  tailoring  method  may  eventually  be  used  to  select  a meaningful  subset  of  these  products;  the 
current  version  of  the  paper  will  not  explore  this. 


TAILORING  INFORMATION  PRODUCTS 

The  hierarchy  of  S MAP- recommended  information  products  for  the  software  development  effort  is  shown  in 
Figure  1. 


Software  Process  Model 


Concept  Phase 
-Activities 

-Information  Products 


Requirements  Phase 
-Activities 

-Information  Products 


-Management  plan 
-Acquisition  plan 
-RFP 
-WBS 

-Dev.  contract 
-Config  Mgmt  plan 
-Risk  mgmt  plan 
-Assurance  plan 
-Concept  spec 
-Assurance  specs 
-Lessons  learned  doc 
-Assurance  reports 
-Phase  transition  re- 
view reports 


•Development  plan 
■Test  plan 
■I V«fcV  plan 
•SE&O  plan 
•Requirements  spec 
•Interfaces 
■User's  guide 
■Acceptance  test  doc 
•Discrepancy  reports 
-Eng.  change  proposals 


Design  Phase 
-Activities 

-Information  Products 

-Eng  <fc  Integ  plan 
-Support  plan 
-Architectural  spec 
-Detailed  spec  - 
-Integration  test  ___ 
-Prototyping  reports 


Implementation  Phase 
-Activities 

-Information  Products 

-Software  components 
-Maintenance  manual 
-Unit  test  document 
-Unit  test  reports 
^-Customer  inspect  report 


It  is  the  content  of  these  documents  that  is  addressed  by  the  various 
software  development  methodologies.  The  tailoring  method  will  also 
address  recommendations  for  the  contents  of  these  documents. 


Figure  I - SMAP  Information  Product  Overview 

Each  Information  Pn>:;uct  shown  will  be  analyzed  to  determine  which  quality  criteria  it  best  supports.  The  same 
analysis  will  be  applied  to  the  information  products  generated  by  various  development  methodologies.  At  this  point, 
we  will  be  ready  to  translate  a set  of  15  user  defined  Quality  Factors  sr.:o  a recommended  set  of  iniermation  prod- 
ucts. 

Tailoring  will  proceed  on  three  levels: 

1.  A subset  of  the  document  universe  will  be  selected  for  the  specific  quality  profile.  Example:  recommend 
producing  a Software  Recuu-ements  Spec,  among  other  documents. 

2.  For  each  selected  information  product,  a subset  of  it’s  maximum  table  of  contents  will  be  selected.  Example: 
recommend  defining  a Data  Definition  section  in  the  Software  Requirements  Spec,  among  other  secuons. 

3.  For  each  recommendation  from  the  table  of  contents,  a set  of  suggestions  will  be  given  to  characterize  the 
nature  of  the  information  that  should  appear  therein.  Example:  make  the  following  recommendations  for 
the  contents  of  the  Data  Definition  section:  minimize  the  number  of  different  data  representations,  mini- 
mize number  of  data  conversions,  use  dynamic  memory  allocation,  pack  all  data  items,  etc. 

The  user/developer  then  examine  the  lists  of  recommendations,  and  decide  whether  they  make  sense  in  the  context 
of  the  project.  There  may  still  be  some  manual  tailoring  to  do,  but  the  bulk  of  the  job  will  have  been  performed  by 
this  method. 
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FUTURE  WORK 

The  length  of  this  study  was  not  great  enough  to  develop  the  full  translation  from  Quality  Criteria  to  Information 
Products.  As  a starting  point*  the  requirements  volume  contents  in  Appendix  B have  been  mapped  to  quality  crite- 
ria. Areas  that  need  more  work  are: 

1 . Develop  the  complete  translation  between  Quality  Criteria  and  all  information  products  listed  in  the  Appen- 
dices. This  will  include  not  only  the  selection  of  specific  products,  but  recommendations  for  the  character  od 
that  product's  content. 

2.  Extend  the  tailoring  method  to  include  the  tailoring  of  Management  and  Assurance  activity  products,  as  wedl 
as  technical  development  products. 

3.  Define  a weighting  scheme  for  ranking  Quality  Factors  that  is  consistent  with  Software  Process  Model  and 
Design  Methodology  characteristics. 

4.  Analyze  the  list  of  information  products  generated  by  the  outstanding  process  models  in  use  today,  and 
annotate  with  descriptions  of  the  information  content  of  each  product.  These  descriptions  should  be  com- 
patible with  the  weighting  scheme  defined  in  area  3. 


Appendix  A 

LIFE  CYCLE  PHASES  & INFORMATION  PRODUCTS: 

NASA’S  SOFTWARE  ACQUISITION  STANDARD 

This  appendix  lists  the  life  cycle  phases  and  information  products  for  NASA's  Software  Acquisition  Life  Cycle  is 
defined  by  the  agency's  Software  Management  and  Assurance  Program  (SMAP).  This  set  of  documentation  wall 
serve  as  the  universe  from  which  a tailored  set  will  be  extracted. 

The  SMAP  plan  for  volume  roll-cut  describes  a mechanism  which  allows  the  manager/developer  to  create  informa- 
tion products  as  sections  of  one  volume,  or  as  separate  individual  volumes,  or  as  a combination,  depending  upon  t ne 
required  complexity  and  management  of  the  particular  information  product.  The  tailoring  method  will  select  a subset 
of  these  information  products  by  recommending  the  "complexity"  of  each  information  product.  It  is  recognized  that 
there  are  considerations  for  tailoring  other  than  the  quality  profile*  especially  as  apply  to  the  Management  Plan- 
Initial  tailoring  guidelines  will  focus  on  the  Product  Specification,  then  the  Assurance  Specification. 

Life  Cycle  Phases 

Concept  Definition  Phase  (CD) 


Requirements  Definition  Phase  (Req):  User  requirements.  System  Requirements 
Design  Phase:  Software  Architectural  Design  (SAD),  Software  Detailed  Design  (SDD) 
Implementation  Phase. (Imp!) 

Integration  and  Test  Phase:  Integration  <&  Unit  Test  (I AT).  Acceptance  Test  (AT) 
Maintenance,  or  Sustaining  Engineering  & Operations  (SE&O) 

Information  Products:  Data  Item  Descriotions  (DID) 

inrludinf  updater 

Management  Activity  Products:  the  Management  Plan 

Product 

Phase  fs)  durint  which  product  is  generated. 

Comnonent  Management  Plan 

CD  I AT 

SFAO 

Component  Acquisition  Plan 

CD 

Request  for  Proposal 

CD 

Work  Breakdown  Structure 

CD 

Software  Develonment  Contract 

CD 
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Appendix  B 

INFORMATION  CONTENT  of  the  NASA-SMAP  STAN* 

DARD  SOFTWARE  PRODUCT  SPECIFICATION 

( This  appendix  lists  the  full  table  of  contents  for  SMAP's  Software  Product  Specification  (SM  AP-DfD-POOO-SW) . 
This  document  package  contains  a Software  Concept  Document,  a Software  Requirements  Spec,  a Software  Archi- 
t tectural  Design  Spec,  a Software  Detailed  Design  Spec,  a delivery  Version  Description,  a User's  Manual  and  a 

Maintenance  Manual,  (from  [ 15]) . The  contents  have  been  extended  to  include  a more  complete  list  of  information 
items  that  may  be  useful  (from  [1]).  The  extended  items  are  italicized. 

An  initial  pass  at  mapping  document  sections  to  quality  criteria  has  been  performed  for  the  Requirements  Volume  — 
the  map  uses  abbreviations  shown  in  the  key  below,  and  should  be  read  "backwards"  for  each  criterion.  In  other 
words,  the  map  is  to  be  used  by  selecting  those  document  sections  that  show  a reference  to  each  criterion  that  is 
l specified  by  the  quality  profile. 


Ac: 

Accuracy 

DQ: 

Document  Quality 

Sf: 

Safety  Management 

AM: 

Anomaly  Mgmt 

EC: 

Communication  Efficiency 

Sd: 

Self-descriptive  ness 

Ag: 

Augmentability 

EP: 

Processing  Efficiency 

Sm: 

Simplicity 

At: 

Autonomy 

ES: 

Storage  Efficiency 

Sp: 

Support 

Cm: 

Commonality 

FS: 

Functional  Scope 

SA: 

System  Accessibility 

Cc: 

Communicativeness 

Gn: 

Generality 

SC: 

System  Compatibility 

Cp: 

Completeness 

Ip: 

Independence 

Tc: 

Traceability 

Cn: 

Conciseness 

Is: 

Instrumentation 

Tr 

Training 

Cs: 

Consistency 

Md: 

Modularity 

Vn 

Virtuality 

Ds: 

Distributivity 

Op: 

Operability 

Vs: 

Visibility 

Key:  Quality  Criteria  Abbreviations 


The  Introduction  and  Related  Documentation  sections  are  recommended  in  their  entirety  for  every  software  devel- 
opment effort.  Content  of  the  volumes  following  will  be  addressed  by  the  tailoring  method.  (At  present,  only  the 
Requirements  Volume  is  addressed). 


; Introduction 
i Identification  of  Volume 

Scope  of  Volume 

. Purpose  and  Objectives  of  Volume 

j Volume  Status  and  Schedule 

Volume  Organization  and  Roll-Out 

Related  Documentation 
Parent  Documents 
Applicable  Documents 
Information  Documents 

<* 

| Concept  Volume 

Definition  of  Software 
Purpose  and  Scope 
Goals  and  Objectives 
Description 
Policies 

[ ' Anticipated  Uses  of  System 

’ Optional  Configurations 

User  Definition 
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Overview  of  tha  User  Organization 
Logical  organization 
Physical  organization 
Temporal  organization 
reporting  cycles 
scheduled  events 
Information  flow  organization 
Capabilities  and  Characteristics 
Sample  Operational  Scenarios 
Anticipated  Operational  Strategy 
System  ownership 
System  administration 
operational  control 
modification  policy 
change  support 
User  administration 
departments 
skill  level : 

Funding  stre^z  jy 
Currently  Used  Procedures 

Requirements  Volume 

Requirements  Approach  and  Tradeoffs 

Design  Standards  to  be  used 

World  Model  (Information  model)  type  A 

Entity- Re  lot  ion  summary  (Data  Requirements) 

Entities:  description . attributes . class  size 
Attributes:  description . values . defaults,  constraints . 

class  size,  retention! archive  requirements 
Relationships:  description,  size,  components,  constraints 
Individuals  (instantiations  of  entities) 

World  Model  (Information  model)  type  B — 

Objects:  description . allowed  operations ^ class  size 
Allowed  Operations:  constructors , interrogators , 
iterators,  etc . 

Messages:  sent,  received 
External  Interface  Requirements-™-  - ■ 

Operational  Resources  A Resource  Limitations 
Requirements  Specification 

Process  and  Data  Requirements 

Function  Input  data  A Source 

Function  Transactions  and  Algorithms  - 

Function  Output  data  <fc  Destination 

Function  Triggering  mechanisms  A conditions 
Function  Termination  mechanisms  A conditions 

Function  Expected  demand 

Data  Definition  - 

Data  Relationships 

Data  Protection  requirements 

Data  Validity  check  requirements 

Data  Parameterization  requirements 

Data  Format  or  Implementation  Restrictions 
System  Behavior  Requirements 
Phases  A Modes 
System  Actions 


DQ,  Tc 

Cm,  Cs,  Md,  SC 
Ag.  Cc,  Md.  Sd.  Vr 


Ag.  Cc,  Md.  Sd.  Vr 


-Cc,  EC.  SC 
-EC,  EP.  ES.  Vr 


-Ac,  Ag,  AM,  Cc.  Cm,  Gn.  SC.  Sd,  Tc.  V$ 
-Ac.  Ag,  AM,  Cp,  Cs,  EP,  FS,  Gn.  Md 
-Ac,  Ag.  AM,  Cc.  Cm,  Gn,  SC.  Sd.  Tc,  Vs 
-AM.  Cm.  EP 
-AM.  Cm.  EP 
-EP 

-Ac,  Ag.  At 
-Ac,  Ag,  At 
-Op 

-Ac.  AM.  Gn,  Ip.  Op.  SA 
-Ac,  Ag.  Gn,  Sd.  Vr 
-Ac.  Ag.  At 

-Ac.  .Ag.  AM.  SI 
-Ag.  AM.  Cm.  Sf 
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Performance  and  Quality  Engineering  Requirements 
Timing  & Sizing  requirements- 


Sequencing  <fc  event  timing  requirements- 
Throughput  <fe  capacity  require ments- 


-EC,  EP,  ES 
-EC,  EP 
-EC.  EP 


Error  Detection,  Isolation,  Recovery  requirements — Ac,  AM,  Ds,  Is,  Sf 


Quality  Engineering  require ments- 
Quaiity  factors  required 
Safety  Requirements 
Security  and  Privacy  Requirements 
Access  requirements 

to  functions 

to  data 

to  code 


Legal  requirements - 
Audit  requirements - 


Other  policy-based  requirements 

Implementation  Constraints 

Site  Adaptation 

Design  Goals 

Human  Factors  Requirements 
User  type  definition 

level  of  computer  sophistication- 
technical  competence  required — 
Physical  constraints 
response  time- 


special  physical  limitations! requirements- 

On-line  help  requirements 

Robustness  requirements 

Failure  message  <fc  diagnostic  requirements — 
fnputlOutput  convenience  requirements 
defaults 
formats 

Traceability  to  Parent's  Design 

Partitioning  for  Phased  Delivery  — — 


-ALL 


-AM,  Sf.  SA 


-Cm,  Sf,  SA 
-Cm,  Sf,  SA 
-Sf.  SA 
-Sf 
-Vs 

-Ag,  Ds,  Ip 
-Ag,  At,  Gn 
-Cn,  Co,  Gn,  Sm 


-Op,  Cc 
-Op.  Cc 


-Cm,  Op 
-Cm.  Op 
-Op 

-AM.  Gn.  Sf.  SA 
-AM,  Cm,  Cc,  Gn,  Is,  Op 
-Cm,  Cc.  Is.  Op 


-Tc,  Sm 
-DQ.  Tc,  Vs 


Design  Volume 

Architectural  Design 

Design  Approach  and  Tradeoffs 
Architectural  Design  Description 
External  Interface  Design 
Requirements  Allocation  and  Traceability 
Partitioning  for  Incremental  Development 
Deoiled  Design 

Detailed  Design  Approach  and  Tradeoffs 
Detailed  Design  Description 
External  Interface  Detailed  Design 
Coding  and  Implementation  Notes 
Firmware  Support  Manual 


Version  Description  Volume 
Product  Description 
Inventory  and  Product 
Materials  Released 
Product  Content 
Change  Status 

Installed  Changes 
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Waivers 

Possible  Problems  and  Known  Errors 

User  Documentation  Volume 
User’s  Guide 

Overview  of  Purpose  and  Function 
installation  and  Initialization 
Startup  and  Termination 
Functions  and  their  Operation 
Error  and  Warning  Messages 
Recovery  Steps 
User’s  Training  Materials 

Maintenance  Manual  Volume 
Implementation  Details 
Modification  Aids 
Code  Adaptation 
Standards 

Abbreviations  and  Acronyv-i 

Glossary 

Notes 

Appendices 


Appendix  C 

DESIGN  METHODOLOGIES  and  their  INFORMA- 
TION PRODUCTS 

“Hus  appendix  lists  information  products  generated  by  the  more  popular  analysis  k design  methodologies  of  the  day 
(compiled  from  (3],  (9]).  These  products  make  up  a portion  of  the  contents  of  the  Software  Product  Spec  as  listed  in 
Appendix  A and  Appendix  B.  It  is  hoped  to  extend  the  tailoring  method  to  recommend  an  appropriate  set  of  design 
methodology  information  products  based  cn  the  quality  profile. 

Functional  Decomposition 

Structured  Design  fSP)  — Constantine/Mvers/Yourdon 

This  is  the  traditional  data  flow  diagram  methodology  that  has  been  in  use  since  the  early  seventies.  It’s  main 
products  are  a hierarchical  set  of  data  flow  diagrams,  process  specifications  and  a data  dictionary.  State  transi- 
tion diagrams  may  also  be  used  when  deemed  i\  pessary  by  the  analyst. 

Real  Time  Structured  Aasluil  & Design  (RTS ADI 

This  methodology  is  similar  to  SD.  but  includes  the  analysis  and  design  of  coxxrol  flow  between  processes.  State 
transition  diagrams,  decision  tables  and  process  activation  tables  are  used  with  more  regularity. 

Qbiect  Oriented  Deaim  rOODt 

OOP  — Booch 

The  objects  defined  in  Booch's  OOD  have  associated  attributes  and  allowed  operations.  They  use  the  concepts 
of  visibility,  class  and  inheritance,  and  they  communicate  with  each  other  via  message  passing.  One  of  Booch’s 
goals  in  designing  this  methodology  was  to  be  compatible  with  the  Ada  language,  and  the  objects  map  well  to  Ada 
constructs. 

GOOD  (General  OOP)  — Seidewitr 

The  objects  defined  in  this  OOD  have  associated  aitnbutes  only.  They  are  ved  to  one  another  not  by  message 
passing,  but  by  defined  relationships.  This  is  an  attempt  to  model  the  real  world  more  closely,  and  applies  well  to 
non-^al  time  application*,. 
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Oth*r  Methodologies 


J»rksnn  Structured  Dftrign  ( TSDl  — Jackin 

This  unique  approach  was  an  early  contender  on  the  requirements  modeling  scene,  and  is  still  going  strong.  As 
industry  has  developed  the  terms,  we  discover  that  JSD  is  a natural  hybrid  of  Object  Oriented  and  Functional 
Decomposition  methodologies.  JSD  has  its  own  set  of  information  products  which  do  not  match  100%  any  of  the 
traditional  products  in  the  map  below,  but  I show  what  traditional  products  are  most  like  those  produced  by  JSD. 
rather  than  specifying  and  defining  new  product  categories. 

Ada-based  Design  Annroach  for  Real  Tim*  Sv stems  TAP  ARTS)  — Gflmit 

This  methodology  is  an  Ada-based  version  of  DARTS;  it  builds  upon  the  SCR  module  structuring  criteria,  the 
Booch  object  structuring  criteria,  and  the  DARTS  task  structuring  criteria  to  generate  maintainable  and  reusable 
software  components.  It  offers  consideration  of  the  concurrent  nature  of  real-time  systems.  The  analysis  and 
design  diagrams  use  the  "Booch-gram"  Ada  notation. 

Software_Cost  Reduction  (SCR1  — Pam  as 


This  real-time  oriented  methodology  concentrates  on  the  modules  that  will  make  up  the  software  product,  an 
information-hiding  hierarchy  into  which  they  fall,  and  the  interfaces  which  they  use  among  themselves.  Without 
trying,  it  is  almost  object  oriented.  The  methodology  offers  strong  support  for  software  reuse. 

Software  Productivity  CQnsortium_MethQdQlQrv_fSPCMl_—  Comma 

This  methodology  is  based  on  SCR.  Its  primary  areas  of  focus  are  the  inclusion  of  rapid  prototyping  techniques 
and  the  production  of  reusable  software. 

Information  Products  of  the  Methodologies 
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Appendix  D 

OTHER  SOFTWARE  PROCESS  MODELS 

A sampling  of  Software  Process  Models  other  than  the  Waterfall  Model  are  briefly  described  here.  Recall  that  their 
associated  information  products  are  very  similar  to  those  described  in  Appendix  A. 
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Spiral 

A management  oriented  model.  Activities  and  products  are  almost  identical  to  those  of  the  waterfall  model,  but  are 
interspersed  with  regular  prototyping  and  risk  analyses  efforts  to  guide  the  process. 

Tfopiri  Pmtn typing 

This  prototyping  model  coven  the  requirements  definition  phases  of  the  waterfall  or  other  similar  model.  It  is  gener- 
ally recommended  for  never-before-attempted  solutions,  or  when  the  user  & developer  deem  areas  of  the  problem 
concept  to  be  technologically  difficult. 

A partial  implementation  of  the  system  is  constructed  from  informal  requirements,  usually  of  poorly  understood 
areas.  Users  exercise  of  the  prototype  to  better  understand  and  define  requirements.  The  prototype  must  then  be 
discarded,  and  system  design  is  begun  from  the  requirements. 

It  is  impoitant  to  avoid  temptations  to  Iteep  and  build  upon  the  prototype,  because  the  very  nature  of  rapid  prototyp- 
ing causes  generation  of  code  that  is  inefficient,  unsafe,  unreliable,  unmaintainable,  etc.  If,  during  development  of 
the  prototype,  algorithms  or  designs  are  discovered  that  are  particularly  efficient,  safe,  reliable,  maintainable,  etc. 
they  should  be  documented  for  consideration  during  the  "rear  design. 

Evolutionary  Prototyping 

This  prototyping  model  is  also  recommended  for  technologically  difficult  problems,  but  coven  a larger  area  of  the 
life  cycle.  It  is  hoped  that  the  evolutionary  prototyping  efforts  will  help  guide  and  speed  the  requirements  definition, 
system  design  and  implementation  phases. 

A partial  implementation  of  the  system  is  constructed  from  partially  known,  well  defined  requirements,  usually  of 
well  understood  areas.  Users  exercise  the  prototype  to  better  undersund  and  define  remaining  requirements.  The 
prototype  forms  a set  of  baseline  software  which  will  be  built  upon  to  complete  the  deliverable  versions.  At  this  point, 
the  model  may  transition  to  the  Iterative  Enhancement  model. 

Development  of  an  evolutionary  prototype  begins  with  well  defined  requirements.  It  takes  longer  than  rapid 
prototyping,  because  good  software  engineering  practices  must  be  used  to  develop  code  that  will  eventually  be  pan  of 
the  working  product. 

Iterative  Enhanc-nent  a.k,a.  Incremental  Development 

This  model  is  recommended  for  applications  that  have  a basic,  well  understood  core  set  of  functions.  The  model  is 
characterized  by  many  releases  of  new  versions  which  add  new  functionality.  Many  market- penetration  schemes  will 
use  this  model  to  get  a product  into  the  marketplace  and  generating  revenue,  to  pay  for  later  enhancements.  A rather 
complete  set  of  requirements  is  known  up  front,  and  the  releases  of  new  functions  are  planned  in  advance;  of  course, 
the  model  is  adaptable  to  new  requirements  and  relies  on  user  feedback  to  improve  the  product. 

Software  Reuse 

This  model  may  be  used  to  cover  the  design  portion  of  the  waterfall  or  other  similar  model.  It’s  design  paradigm 
relies  mostly  on  the  incorporation  of  previously  proven  designs  and  code  into  new  software  products. 

Automated  Software  Synthesis 

This  is  an  advanced  model  that  usually  requires  strict  formulation  of  require  me  ms  using  a regular  grammar  specifica- 
tion language.  This  model  offers  the  direct  (and  hopefully,  automatic)  transformation  of  requirements  and/or  high 
level  design  into  code,  either  algorithmically  or  u»*ng  a knowledge  based  rule  set.  It  is  hoped  to  eliminate  the  middle 
portions  of  the  documentation  set.  centering  around  the  detailed  design. 

CASE  tools  currently  exist  that  support  this  model  to  some  degree.  Typically,  they  will  generate  Ada  package  specs 
and  the  interface  portions  of  package  bodies  from  structure  charts. 
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DEFINITIONS 


• SOFTWARE  PROCESS  MODEL  (or  LIFE  CYCLE) 

v ’’The  technical  and  management  framework  established  for  ap- 
plying tools,  methods  and  people  to  the  software  task.” 

r Applies  to  the  entire  development  cycle  of  the  software,  from 
concept  to  maintenance. 

• SOFTWARE  METHODOLOGY 

v*  Definition  of  a means  for  capturing  requirements  and  design. 

^ Applies  to  one  or  more  portions  of  the  development  cycle,  usu- 
ally requirements  analysis,  specification  or  design. 


• TAILORING 

v*  Selecting  a subset  of  a Process  Model  or  a Methodology  for  prac- 
tical application. 

• SOFTWARE  QUALITY 

y The  degree  to  which  software  matches  customer/user  needs. 
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INTRODUCTION 


+ MANY  SOFTWARE  PROCESS  MODELS  AND  SOFTWARE 
METHODOLOGIES  RECOMMEND  TAILORING. 

t TAILORING  IS  USUALLY  GUIDED  BY  PERSONAL  EXPE- 
RIENCE, ABILLIY,  AND  TRADITION. 

* WE  WILL  DESCRIBE  A METHOD  FOR  TAILORING. 


CUSTOMER/ 

USER  NEEDS 

TAILORING  TAILORING 
METHOD  “P*  RECOMMENDATIONS 


SUBSET  OF  INFORMATION  PRODUCTS 


ALL  INFORMATION  PRODUCTS  OF 
A SOFTWARE  PROCESS  MODEL 
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CHARACTERIZING  CUSTOMER/USER  NEEDS 

* WE  WILL  USE  CONCEPTS  FROM  SOFTWARE  QUALITY 
ASSURANCE  (SQA)  TO  EXPLORE  CUSTOMER  NEEDS; 
v*  What  constitutes  appropriate  fitness  for  use  of  this  software? 
p0  What  attributes  must  this  software  exhibit  to  be  considered  of 

high  quality? 

v0  Remember,  software  quality  is  more  than  ’’goodness”,  it  is  a 
measure  of  how  well  the  software  matches  the  needs  of  the  cus- 
tomer and  user. 

* SQA  SHOWS  HOW  TO  OBJECTIFY  A QUALITY  RATING 
OF  SOFTWARE,  BY  EVALUATING  QUALITY  FACTORS. 

p0  Capture  Quality  Factors  through  Customer/User  interviews. 

+ SQA  SHOWS  HOW  TO  TRANSLATE  QUALITY  FACTORS 
TO  QUALITY  CRITERIA.  WHICH  ARE  MORE  DIRECTLY  RE- 
LATED TO  SOFTWARE  TESTABILITY. 
p0  Derive  Quality  Criteria  from  Quality  Factors 
p0  Derive  development  techniques  to  enforce  Quality  Criteria 
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THE  METHOD’S  STEPS 

1 . PERFORM  STANDARD  INTERVIEWS  AND  DIALOGS  BE- 
TWEEN DEVELOPER  AND  CUSTOMER/USER. 

2.  GENERATE  A PROFILE  OF  QUALITY  FACTORS  OF  THE 
SOFTWARE  TO  BE  DEVELOPED. 

3.  TRANSLATE  THIS  QUALITY-NEEDS  PROFILE  INTO  A 
SET  OF  QUALITY  CRITERIA  THAT  MUST  BE  MET  BY 
THE  SOFTWARE. 

4.  MAP  THE  CRITERIA  TO  A SET  OF  REQUIREMENT  AND 
DEVELOPMENT  TECHNIQUES. 

5.  SELECT  AND  TAILOR  THE  INFORMATION  PRODUCTS 
WHICH  MATCH  OR  SUPPORT  THOSE  TECHNIQUES. 

| 6.  SELECT  AND  TAILOR  DESIGN  METHODOLOGY(S)  TO 

* PRODUCE  THESE  INFORMATION  PRODUCTS. 
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THE  METHOD’S  STEPS 


1 
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Step  1 

PERFORM  STANDARD  INTERVIEWS  AND  DIALOGS  BETWEEN  DEVEL- 
OPER AND  CUSTOMER/ USER 

» QUESTIONNAIRES  DESIGNED  TO  PROBE  THE  USER’S 
NEEDS  FOR  QUALITY. 

t IMPORTANT  TO  DEFINE  BOUNDARY  OF  SPECIFICA- 
TION, TO  PREVENT  OVER-  OR  UNDER-SPECIFICATION 
OF  QUALITY  NEEDS. 

* DEVELOPER  WRITES  QUESTIONNAIRES,  USING  A 
GREAT  DEAL  OF  BOILERPLATE  AND  HELPS  CUS- 
TOMER/USER THROUGH  THE  PROCESS. 

* EXAMPLE  QUESTIONS 

||  v*  How  many  users  will  want  to  use  the  system  simultaneously? 

§>  v*  What  level  of  user  training  is  acceptable? 

* v*  Will  other  computer  systems  rely  on  this  one? 
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Step  2 

GENERATE  A PROFILE  OF  QUALITY  FACTORS  OF  THE  SOFTWARE  TO 

BE  DEVELOPED 

» QUANTIFY  RESPONSES  TO  USER  QUESTIONNAIRES. 

t THE  TAILORING  METHOD  DEFINES  A TRANSFORMA- 
TION BETWEEN  POSSIBLE  RESPONSES  AND  QUALITY 
FACTORS. 

* THE  TRANSFORMATION  WILL  APPLY  WEIGHTED  VAL- 
UES TO  EACH  RESPONSE,  BASED  UPON  THE  EFFECT 
THE  ISSUE  PROBED  BY  THE  QUESTION  HAS  UPON  ITS 
RELATED  FACTOR(S).  (Most  questions  will  deal  with  decisions 
that  influence  several  factors  to  varying  degrees,  even  positively  for 
some  and  at  the  same  time  negatively  for  others). 

t SINCE  SOME  FACTORS  CONFLICT  WITH  OTHERS,  A SEC- 
OND USER  INTERVIEW  MAY  BE  NECESSARY  TO  AM- 
PLIFY RELATIVE  IMPORTANCE.  Factor  conflict  may  assist 
risk  identification  and  management. 


(Ml  Koltwdie  Lnit)iitt!iiiig  Wuikuliup 

Novcinbci  28.  IV'JO 


PiftttoOl 


I 


A Method  for  Tailoring  the  Information  Content  of  a Software  Process  Model 


Step  3 

TRANSLATE  THE  QUALITY-NEEDS  PROFILE  INTO  A SET  OF  QUALITY 

CRITERIA  THAT  MUST  BE  MET  BY  THE  SOFTWARE 

* PRE-DEFINED  GUIDELINES  MAP  FACTORS  TO  CRITE- 
RIA. 

t THIS  TRANSLATION  BRINGS  US  CLOSER  TO  WHAT 
QUALITY  MEANS  IN  TERMS  OF  A SOFTWARE  PROD- 
UCT, RATHER  THAN  IN  TERMS  OF  THE  USER. 

t SOME  CRITERIA  ALSO  CONFLICT  WITH  ONE  AN- 
OTHER. THIS  TRANSLATION  WILL  ASSIGN  RELATIVE 
WEIGHTS  TO  THE  CRITERIA  TO  HELP  REDUCE  CON- 
FLICTS. 

t REMEMBER,  CONFLICTS  ARE  NOT  IMPOSSIBILITIES, 
THEY  MERELY  IDENTIFY  AREAS  REQUIRING  EXTRA 
EFFORT  AND  EXCEPTIONAL  TECHNIQUES  - RISK  MAN- 
AGEMENT. 
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Step  4 

MAP  THE  CRITERIA  TO  A SET  OF  REQUIREMENT  AND  DEVELOP- 
MENT TECHNIQUES 

+ TECHNIQUES  OF  DEVELOPMENT  AND  MANAGEMENT 

MAY  BE  USED  TO  ENSURE  THE  PRESENCE  OF  VARIOUS 
QUALITY  CRITERIA. 

+ TYPES  OF  TECHNIQUES 
v*  Product  Recommendation 
p*  Method  Recommendation 
p*  Standards  Recommendation 
p0  General  Guidelines 

+ EXAMPLES 

p*  Produce  a traceability  matrix  to  ensure  Completeness. 
p0  Use  prototyping  to  ensure  Usability, 
p*  Adhere  to  interface  standards  to  ensure  Commonality, 
p - Separate  critical  & non-critical  functions  to  ensure  Safety  Man- 
agement. 
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Step  5 

SELECT  AND  TAILOR  THE  INFORMATION  PRODUCTS  WHICH  MATCH 

OR  SUPPORT  THE.  TECHNIQUES 

» INFORMATION  PRODUCTS  ACT  AS  SPECIFIC  GOALS 
WHICH  FORCE  US  TO  RECOGNIZE,  FORMALIZE  AND 
ADHERE  TO  TECHNIQUES  TO  SPECIFY,  DESIGN  AND 
IMPLEMENT  SOFTWARE  OF  APPROPRIATE  QUALITY. 

t INFORMATION  PRODUCTS  DOCUMENT  REQUIRE- 
MENTS AND  DESIGNS,  PROVIDING  FOR  CONTINUITY 
OF  DEVELOPMENT  AND  MAINTENANCE. 

t WE  WISH  TO  SELECT  THE  APPROPRIATE  SUBSET  OF 
ALL  POSSIBLE  INFORMATION  PRODUCTS. 

* THE  TAILORING  METHOD  WILL  DESCRIBE  A UNI- 
VERSE OF  INFORMATION  PRODUCTS,  AND  WILL  OF- 
FER A DIRECT  TRANSLATION  FROM  QUALITY  CRITE- 
RIA TO  RECOMMENDED  SUBSET  OF  THAT  UNIVERSE. 


P 


tiiil't1  Soliwtiiv  Liigincoimg  Wuikiluip 
r 

% 


10  - 


Mmk  AiemJ 
*■ 

* 


4, 


i 


Pa*e  29  of  31 


k 


A Method  tor  Tailoring  the  Information  Content  of  a Software  Process  Model 


r- 


Step  6 

SELECT  AND  TAILOR  THE  DESIGN  METHODOLOGY  WHICH  PRO- 
DUCES THESE  INFORMATION  PRODUCTS 

» MANY  METHODOLOGIES  ARE  AVAILABLE  FOR  SOFT- 
WARE REQUIREMENTS  SPECIFICATION,  SOFTWARE 
DESIGN  AND  IMPLEMENTATION. 

t THE  TAILORING  METHOD  WILL  DESCRIBE  A UNI- 
VERSE OF  METHODOLOGIES,  AND  WILL  CATEGORIZE 
THEM  BY  THE  INFORMATION  PRODUCTS  THEY  PRO- 
DUCE. 

» THE  MATCHUP  BETWEEN  INFORMATION  PRODUCTS 
PRODUCED  BY  A METHODOLOGY  AND  THOSE  RECOM- 
MENDED TO  ACHIEVE  THE  QUALITY  PROFILE  FACILI- 
gj  TATES  THE  SELECTION  OF  AN  APPROPRIATE  METH- 

1 1 ODOLOGY. 
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CURRENT  STATUS  AND  FUTURE  WORK 

+ THIS  PHASE  OF  THE  RESEARCH  EFFORT  DEALT  WITH 
DISCOVERY  OF  CONCEPTS  AND  ASSEMBLY  OF  DATA. 

+ AREAS  ALREADY  DEVELOPED  TO  SOME  EXTENT 
^ Translation  from  Quality  Profile  to  Quality  Criteria 
^ List  of  Techniques  sorted  by  Quality  Criteria 

^ Universe  of  Information  Products  (enhanced  NASA  SMAP 
standard) 

^ Universe  of  Methodologies 

+ AREAS  FOR  DEVELOPMENT 
^ User  Questionnaire  boilerplates 
^ Response  weighting  scheme 

^ Transformation  of  weighted  responses  to  Quality  Profile 
^ List  of  Information  Products  sorted  by  Quality  Criteria 
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Managing  software  development  in  large  organizations  has  become  increasingly  cfifficutt 
due  to  constantly  increasing  technical  complexity,  stricter  government  standards,  a 
shortage  of  experienced  software  engineers,  competitive  pressure  for  improved 
productivity  and  quality,  the  need  to  co-develop  hardware  and  software  together,  and  tie 
rapid  changes  in  both  hardware  and  software  technology. 

The  "software  factory*  approach  to  software  development  minimizes  risks  while 
maximizing  productivity  and  quality  through  standardization,  automation,  and  training. 
However,  in  practice,  this  approach  is  relatively  inflexible  when  adopting  new  software 
technologies.  How  can  a large  multi-project  software  engineering  organization  increase 
the  likelihood  of  successful  software  technology  Insertion  (STJ).  especially  in  a 
standardized  engineering  environment? 

HISTOGRAM  of  SOFTWARE  TECHNOLOGY  INSERTION  CASES 
8 

6 

4 

2 

0 

-12  -6  0 +6  +12 
Failure  < ■ ► Success 

Raurel  -Distribution  of  scores  from  49  rated  STI  Cases 

In  an  attempt  to  correlate  various  success  factors  with  levels  of  success,  59  cases  of  "new 
software  technology  insertion*  in  thirteen  recent  projects  at  a large  U.S.  Defense 
electronics  contractor  were  identified  and  categorized  according  to  several  criteria.  The 
relative  success  or  failure  of  49  of  these  cases  (see  Figure  1)  was  determined  by 
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having  key  project  personnel  (Lead  Engineer,  Dept  Manager,  and  tool  supporters)  rate  6 
aspects  (added  together  for  total  rating)  of  the  software  technology  insertion  results. 
Maximum  success  was  scored  as  +12,  and  maximum  failure  as  *12  on  the  rating  scale. 
The  histogram  in  Figure  1 illustrates  the  dstribution  of  scores  from  the  49  rated  cases. 

There  were  21  Afferent  new  software  technologies  studied,  most  of  them  new  tools 
or  methods,  including  (in  approximate  lifecycle  order): 

• The  use  of  DoD-STD-2167  or  2167A 

• Structured  analysis  CASE  tools 

• Ripid-Prototyping  in  requirements  or  design 

• In-House  requirements  traceability  tool 

• In-House  program  design  language  (PDL)  tools 

• Reusable  Software  in  design  or  coding 

• The  use  of  Ada®  as  an  implementation  language 

• The  use  of  M68C20  assembly  language 

• Microprocessor  Development  Stations  (MDS)  for  integration  testing 

• In-House  configuration  management  (CM),  source  code  control  tool 

• Workstation-based  engineering  documentation  tods 

• The  use  of  workstations  as  primary  development  platforms 

Though  meaningful  statistical  correlations  were  not  possible  due  to  the  limited  sample 
size,  ratings  were  compiled  and  empirically  compared  with  several  technology  factors 
measured  for  each  STI  case,  induding: 

• Technology  Type  (Competence-Enhandng  or  -Destroying) 

• Support  Type  (In-House  or  External) 

• Maturity  of  the  Technology  (Young,  Mature,  and  Old) 

• Project  Size  (SLOC) 

• Prior  Expectations  (for  success  or  failure) 

• Reasons  (for  using  the  new  software  technology) 

• Methods  (of  inserting  the  new  software  technology) 

• Perceived  Time  Savings 
• Perceived  Labor  Savings 

• Perceived  Computer  Cost  Savings 

• Perceived  Quality  Improvement 

• Met  Expectations?  (for  success  or  failure) 

A doser  look  at  the  Top  Eleven*  cases  of  successful  STI  (ratings  2 +7),  and  the 
‘Bottom  Seven*  cases  of  unsuccessful  STI  (ratings  £ -7)  shows  that: 

1.  Perceived  Time  Savings  and  perceived  Labor  Savings  are  the 
most  significant  real  indicators  of  successful  or  unsuccessful  STI. 

2.  Though  users  often  complain  about  increased  computer  costs, 
saving  computer  cost  is  not  an  indicator  of  STI  success.  because  it 
is  not  usually  a goal  or  a motivator  for  the  use  of  new  technology. 

3.  Perceived  Quality  Improvement  is  a strong  indcator  of  STI 
success,  but  not  an  indcator  of  STI  failure. 
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4.  Even  in  successful  STI  cases,  users'  Prior  Expectations  about 
what  a new  technology  can/cannot  do  are  not  managed  affectively 

In  addition  to  the  success  ratings,  on-site  structured  Interviews  were  used  to  profile 
each  new  technology,  and  collect  other  qualitative  information  that  was  used  to  clarify 
and  complete  the  data. 

Tushmanfi]  describes  new  technology  types  as:  (1)  competence  enhancing  - 
incrementally  different  buildng  on  existing  know-how,  and  substituting  for  older 
tedmologies  without  rendering  their  skills  obsolete,  or  (2)  competence-destroying  - 
fundamentally  Afferent  requiring  new  skills,  abilities,  and  knowledge  for  use.  The  main 
types  of  technology  support  are:  (1 ) In-House,  where  the  supporters  work  in  the  same 
organization  as  the  users,  or  (2)  Outside,  where  the  supporters  work  in  a different 
organization  than  the  users. 

A sanpie  of  the  dstribution  of  successful  STI  cases  over  these  two  combined  factors 
(technology  type  and  support  type)  is  show  in  Figure  2: 

Ratings  of  New  Technology  Types  Across  Two  Dimensions 


IN-HOUSE  OUTSIDE 

Support  Support 


Competence 

ENHANCING 

♦Total-  1 6 

♦Rated-  1 3 

Tot  Rating-  47.0 
Median-  +5.0  .. — . 
Ave  Rating-  (+3.6) 

♦Total-  9 

#Rated=  8 

Tot  Rating-  1 1 .5 
Median-  +4.0. — 
Ave  Rating-  (+1 .4) 

BEST 

OK 

Competence 

DESTROYING 

♦Total-  1 1 

♦Rated-  8 

Tot  Rating-  -2.0 
Median-  +1.5  . — 
Ave  Rating-  (-02) 

♦Total-  23 

♦Rated-  20 

Tot  Rating-  14.5 

Median-  +0.7 

Ave  Rating-  (+0/y 

Poor 

Marginal 

RATING  SCALE:  +12  - Maximum  Success,  -12  « Maximum  Failure 
Figure  2 - Distribution  of  Success/Faflure  across  two  factors 
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Tlie  new  software  technologies  that  had  the  most  successful  ST1  experience  {though 
across  a very  limted  set  of  cases)  are  summarized  below: 
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2 
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3 
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Avfl  Rating  New  Software  Technology 

+9.5  In-House  Automated  Build  Tool 

+7.5  Microprocessor  Development  Stations  for  Integration 
+4.8  In-House  Software  Problem  Reporting  Tool 
+3.3  In-House  Configuration  Management  (CM)  Tod 

+2.1  In-House  Program  Design  Language  Tool 


The  new  software  technologies  that  had  the  least  successful  ST1  experience  are: 


#Cases  Ave  Rating  New  Software  Technology 

1 -110  In-House  Automated  Code  Documentation  Tod 

2 -8.8  Workstation-based  Engineering  Documentation  Tool 

4 -4.8  Workstation-based  CASE  Tool  for  Req*ts  and  Design 


Among  the  overall  conclusions  from  the  study  are: 

1 . Saving  schedule  time  and  labor  costs  are  necessary  and  sufficient 
conditions  for  successful  STI 

2.  Improving  quality  is  a necessary,  but  not  suffitient  condtion  for 
successful  Software  Technology  Insertion  (STI) 

3.  Success  with  new  software  technology  insertion  (STi)  is  much  greater  for 
competence-enhancing  than  for  competence-destroying  technologies 

4.  Success  with  STI  is  somewhat  greater  for  in-house  supported 
technologies  than  for  outside  supported  technologies 

5.  Success  with  STI  is  greater  for  mature  technologies  than  for  either  young 
or  old  technologies  (mature  is  >1  year  after  release,  <5  yeas  after  release) 

6.  Success  with  STI  is  greater  when  users'  expectations  about  ‘new 
technology*  are  controlled  to  avoid  expecting  too  much  - exceeding  users' 
expectations  is  not  necessary  for  successful  STI,  but  ngl  meeting 
expections  (i.e.,  disappointing  them)  is  a sufficient  condition  for  failure 

7.  Success  with  STI  can  be  increased  when  there  is  synergy  between 
multiple  new  technologies,  such  as  Ada  and  workstations 


These  and  other  results  and  conclusions,  along  with  some  recommendations  fcr  large 
software  development  organizations,  will  be  covered  at  the  workshop. 


nUKmci  [i]  Tiatwn.  M.  andAndaraon.  P,  Tachnolcqlcal  ObcondnuMaa  and  Oryrtraflorad  EaWorananta*. 
AdnrtftmNw  Sdtncm  Quarterly,  Sapt  1060. 
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13  Software  Projects 


Project  IM 

Language 

SLOC 

Current  Phase 

1 

Assembly 

4920C 

Integration  Test 

Fortran 

6400 

C 

2900 

2 

C 

9100 

Design  & Code 

3 

Assembly 

4000 

Maintenance 

C 

4500 

4 

C 

8500 

Integration  Test 

5 

Assembly 

1085C 

Design  & Code 

6 

Assembly 

7100 

System  Test 

7 

Assembly 

1600C 

Integration  Test 

8 

Fortran 

n/a 

Cancelled 

9 

Ada 

2500C 

Design 

10 

Ada 

n/a 

Integration  Test 

11 

Assembly 

4850 

Integration  Test 

12 

C 

1370C 

Maintenance 

13 

Assembly 

1800C 

Design  & Code 

21  New  Software  Technologies 

(most  of  them  new  tools,  methods,  languages) 

A - The  use  of  Ada®  as  an  implementation  language 
B - In-House  automated  build  tool(s) 

C - In-House  automated  code  documentation  tool 
D - In-House  program  design  language  (PDL)  tools 
E - In-House  metrics  tools  for  automatic  data  collection 
G - In-House  standard  test  reporting  tool  based  on  RDBMS 
I - Workstation-based  engineering  documentation  tool 
J - The  use  of  M68020  assembly  language 

K - Microprocessor  Development  Stations  (MDS)  for  integration  testing 
L - In-House  project  scheduling  and  reporting  tool 
M - In-House  configuration  management  (CM),  source  code  control  tool 
N - In-House  Vax/Unix  documentation  package  using  troff 
P - In-House  Software  Problem  Reporting  Tool  based  on  RDBMS 
R - Rapid-Prototyping  in  requirements  or  design 
S - Structured  analysis  graphical  CASE  tool 
T - Structured  analysis  graphical  CASE  tool 
U - Reusable  Software  in  design  or  coding 
W - The  use  of  workstations  as  primary  development  platforms 
X - Workstation-based  engineering  documentation  tool 
Y - In-House  requirements  traceability  tool 
Z - The  use  of  DoD-STD-21 67  or  21 67A 


— 4 


Project/Technology  Matrix 


New  Technology  ID 


59  STI  Cases:  % Rated  Q Not  Rated 


Measuring  Perceived  STI  Success 

• For  each  STI  Case,  6 Questions  were  asked  of: 

(1 ) Lead  Engineer  (Project/Matrix) 

(2)  Dept  Manager  (Functional/Matrix) 


For  Each  STI  Case: 

Statement  (Agree  or  Disagree?) 

Agree — Disagree 
+2  +1  0 -1  -2 

1. 1 would  use  the  new  method/tool  again 

2.  The  new  method/tool  saved  schedule  time 

3.  The  new  method/tool  saved  labor  cost 

4.  The  new  method/tool  saved  computer  cost 

5.  The  new  method/tool  improved  quality 

6.  The  new  method/tool  met  my  expectations 

V 

V 

V 

V 

• Total  Rating  for  each  STI  Case  is  sum  (example  =+1 ) 


i.e.,  maximum  = +12,  minimum  = -12 
(Note:  Questions  not  weighted) 
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Software  Technology  Insertion  (STI) 


Software  Technology  Insertion 
"New"  Software  Technology 

Opportunity  to  Insert 


"New"  Software  Technology 
+ Opportunity  to  Insert 

Tool  or  Method  that  is  unfamiliar 
to  the  majority  of  a Project  Team, 
usually  replacing  a more  familiar  one 

A software  development  activity  on 
a new  (most  likely)  or  ongoing  (less 
likely)  software  project 


Pact  10  of  24 


UNCLASSIFIED 
Missile  Systems  Division 


Software  Technology  Insertion:  Success  Factors 


Raytheon 


Software  Laboratory 


Nov  28. 1990 


Successful  STI 


Perceived  STI  Success  User's  sense  of  Labor  Cost  Savings  + 

User's  sense  of  Computer  Cost  Savings  + 
User's  sense  of  Elapsed  Time  Savings  + 
User's  sense  of  Quality  Improvement 

Real  STI  Success  Measured  Labor  Cost  Savings  + 

Measured  Computer  Cost  Savings  + 
Measured  Elapsed  Time  Savings  + 
Measured  Quality  Improvement 
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STI  "Cases"  Overview 

STI  Case  A single  incident  of  STI 

on  a single  project,  usually 
within  a single  development  phase 

59  STI  Cases  Identified  Across  1 3 different  projects; 

from  1 to  7 STI  Cases  per  project 

49  STI  Cases  Rated  for  Some  of  the  59  identified  cases 
Perceived  Success  were  not  able  to  be  rated 

13  Different  SW  Projects  Some  ongoing,  some  just  completed; 

using  Ada,  C,  Fortran,  Assembly; 
ranging  in  size  from  2900  to  49200  SLOC 

„ p 21  Different  SW  Technologies  Most  new  tools,  methods,  languages  (e.g., 

CASE,  2167A,  Ada,  Rapid-Proto,  Reuse,...) 
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13  Software  Projects 


Project  ID# 

Language 

SLOC 

Current  Phase 

1 

Assembly 

Fortran 

C 

49200 

6400 

2900 

Integration  Test 

P 

C 

9100 

Desian  A Code 

3 

Assembly 

C 

4000 

4500 

Maintenance 

4 

C 

8500 

Intearation  Test 

5 

Assembly 

10850 

Design  & Code 

6 

Assembly 

7100 

System  Test 

7 

Assembly 

16000 

Intearation  Test 

8 

Fortran 

n/a 

Cancelled 

9 

Ada 

25000 

Design 

1 10 

Ada 

n/a 

Intearation  Test 

11 

Assembly 

4850 

Intearation  Test 

12 

C 

13700 

Maintenance  { 

13 

Assembly 

18000 

Design  & Code 
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21  New  Software  Technologies 

(most  of  them  new  tools,  methods,  languages) 

A * The  use  of  Ada®  as  an  Implementation  language 
B - In-House  automated  build  tool(s) 

C - In-House  automated  code  documentation  tool 
D - In-House  program  design  language  (PDL)  tools 
E - In-House  metrics  tools  for  automatic  data  collection 
Q - In-House  standard  test  reporting  tool  based  on  RDBMS 
I - Workstation-based  engineering  documentation  tool 
J - The  use  of  M68020  assembly  language 
K - Microprocessor  Development  Stations  (MDS)  for  integration  testing 
L • In-House  project  scheduling  and  reporting  tool 
M - In-House  configuration  management  (CM),  source  code  control  tool 
N • In-House  Vax/Unix  documentation  package  using  troff 
P • In-House  Software  Problem  Reporting  Tool  based  on  RDBMS 
R - Rapid-Prototyping  in  requirements  or  design 
S - Structured  analysis  graphical  CASE  tool 
T - Structured  analysis  graphical  CASE  tool 
U - Reusable  Software  in  design  or  coding 
W • The  use  of  workstations  as  primary  development  platforms 
X - Workstation-based  engineering  documentation  tool 
P J!  Y - In-House  requirements  traceability  tool 

||  Z - The  use  of  DoD-STD-21 67  or  21 67 A 
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New  Technology  ID 
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Other  Measured  Factors 


Technology  Type  (Competence-Enhancing  or  -Destroying) 
Support  Type  (In-House  or  External) 

Maturity  of  the  Technology  (Young,  Mature,  and  Old) 
Pro|ect  Size  (SLOC) 

Prior  Expectations  (for  success  or  failure) 

Reasons  (for  STI  choice) 

Methods  (of  STI  insertion) 

• Perception  of  Tln^e  Savings 

• Perception  of  Labor  Savings 

• Perception  of  Computer  Cost  Savings 

• Perception  of  Quality  Improvement 

> Result  vs.  Prior  Expectations  (for  success  or  failure) 
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Measui  ing  Perceived  STI  Success 


• For  each  STI  Case,  6 Questions  were  asked  of: 

(1)  Lead  Engineer  (Project/Matrix) 

(2)  Dept  Manager  (Functional/Matrix) 


For  Each  STI  Case: 

Statement  (Agree  or Disagree?)  _ .... 

Agree Disagree 

+2  +1  0 -1  -2 

1. 1 would  use  the  new  method/tool  again 

^ 

2.  The  new  method/tool  saved  schedule  time 

jL 

3.  The  new  method/tool  saved  labor  cost 

V 

4.  The  new  method/tool  saved  computer  cost 

V 

5.  The  new  method/tool  Improved  quality 

i 

6.  The  new  method/tool  met  my  expectations 

_4_ 

• Total  Rating  for  each  STI  Case  is  sum  (example  =+1) 
i.e.,  maximum  = +12,  minimum  = -12 
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(Note:  Questions  not  weighted) 
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Histogram  Of  Software  Technology  Insertion  Cases 
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Main  reasons  tor  "success": 

• "Synergy"  wtthin  a project  (4) 

• Critical  need  for  a capability  (3) 

• "Synergy"  between  two  technologies  (2) 

• Mature  and  powerful  tool  (1) 

May  or  may  not  "Sava  Computer  Costs"  (+0.2) 

"Met  Expectations"  (+0.5)  not  as  critical  as: 

• ‘Save  Time"  (+1.0) 

• "Save  Labor"  (+1.7) 

• "Improve  Quality"  (+1.6) 


Failure  -«*- 


Success 
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" Bottom  Seven " 

- 1 (Rating  £-7) 
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■nil  ii 

■miinii 
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-12 


-6 


+6 


+12 


Bottom  Seven  STI  Cases 

Main  reasons  (or  "(allure": 

• Immature  technology  (3) 

• Interface  problems  (3) 

• Technology  not  “needed"  by  LE  (2) 

• Wrong  technical  solution  (1) 

May  or  may  not  "Improve  Quality"  (-0.4) 

"Save  Computer  Costs”  (-1.1)  not  as  critical  as: 
•"Save  Time"  (-1.8) 

• "Save  Labor"  (-1.9) 

• "Met  Expectations"  (-1.9) 
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Competence-Enhancing  vs  Competence-Destroying 

Competence-Enhancing  technology  - major  improvement  in  price/performance 
that  builds  on  existing  know-how;  a substitute  for  older  technology,  but  does 
not  render  old  skills  obsolete;  increase  efficiency. 

Competence-Destroying  technology  - new  way  of  making  a given  product; 
requires  new  skills,  abilities,  and  knowledge  for  use;  may  combine  previously 
discrete  steps  into  continuous  flow,  or  be  a completely  different  process 


Maturity  of  a New  Software  Technology 
Young  technology  - Released  < 1 year,  or  prior  to  2nd  major  release  (VI  .x) 
Mature  technology  - Released  > 1 year,  and  after  2nd  major  release  (V2.x+) 
| £ Old  technology  - Released  > 5 years,  or  after  end  of  formal  support 
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Raytheon 
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Nov  28. 1990 


Ratings  of  New  Technology  Types  Across  Two  Factors 


IN-HOUSE  Support  OUTSIDE  Support 


Competence 

ENHANCING 

("incremental") 

#Total=  16 

#Rated=  13 

Tot  Rating=  47.0 
Medians  +5.0^> — 
Mean  Ratings  (+3.6) 

#Total=  9 

#Rated=  8 

Tot  Ratings  11.5 
Medians  +4.0  — v. 
Mean  Ratings  (+1.4) 

BEST 

OK 

Competence 

DESTROYING 

("radical") 

#Total=  1 1 

#Rated=  B 

Tot  Ratings  -2.0 

Medians  +1.5/ >. 

Mean  Ratings  Qo.2J 

# Totals  23 

«Rateds  20 

Tot  Ratings  14.5 
Median-  +0.7 ^ — ■>. 
Mean  Rating-  (*0.7) 

Poor 

Marginal 

Mean  = +2.1 

Mean  = +0.9 

Moan  s +2.8 


Moan  & +0.4 


RATING  SCALE:  +12  « Maximum  Success,  -12  - Maximum  Failure 
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Software  Technology  Insertion:  Success  Factors 


Raytheon 


Software  Laboratory  Nov  28, 1990 

Summary  of  Results 

(Focus  on  success  factors  rather  than  successful  technologies) 

(Focus  on  perceived  rather  than  real  STI  success) 

• Saving  schedule  time  and  labor  costs  drive  successful  STI  (obvious?) 

• Improving  quality  is  necessary,  but  not  sufficient  for  successful  STI 

• Exceeding  users'  expectations  not  necessary  for  successful  STI,  but 
not  meeting  expectations  is  sufficient  for  failure  (i.e.,  must  control) 

• Much  greater  success  for  competence-enhancing 
vs  competence-destroying  technologies 

• Greater  success  for  mature  vs  young  or  old  technologies 

| £ • Somewhat  greater  success  for  in-house  vs  outside  supported 
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Next  Step: 

Linking  Perceived  Success  with  Real  Success 
via  Software  Metrics  Collection 


• Corporate-wide  effort  to  implemebi  automatic  collection  of  software 
metrics  as  a by-product  of  development  - MSD  is  Lead  Division 

•10  current  software  metrics  defined  (similar  to  Mitre  Metrics) 

• Based  mainly  on  DoD-STD-2167A 

• AutoCollection  in  development  for  both  project-specific  and 
process-level  (across  multiple  projects)  software  metrics 
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Overview  of  Raytheon  MSD's  Software  Metrics  Collection 
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ABSTRACT 


Due  to  the  large  number  of  product,  project  and  people  parameters  which  impact  large  custom 
software  development  efforts,  measurement  of  software  product  quality  is  a complex  undertaking.  Fur- 
thermore, the  absolute  perspective  from  which  quality  is  measured  (customer  satisfaction)  is  intangible. 
While  we  probably  can’t  say  what  the  absolute  quality  of  a software  product  is,  we  can  determine  the 
relative  quality,  the  adequacy  of  this  quality  with  respect  to  pragmatic  considerations,  and  identify  good 
and  bad  trends  during  development.  While  no  two  software  engineers  will  ever  agree  on  an  optimum 
definition  of  software  quality,  they  will  agree  that  the  most  important  perspective  of  software  quality  is 
its  ease  of  change.  We  can  call  this  flexibility,  adaptability  or  some  other  vague  term,  but  the  critical 
characteristic  of  software  is  that  it  is  soft  The  easier  the  product  is  to  modify,  the  easier  it  is  to  achieve 
any  other  software  quality  perspective. 

This  paper  presents  objective  quality  metrics  derived  from  consistent  lifecycle  perspectives  of  rework 
which,  when  used  in  concert  with  an  evolutionary  development  approach,  can  provide  useful  insight 
to  produce  better  quality  per  unit  cost/schedule  or  to  achieve  adequate  quality  more  efficiently.  The 
usefulness  of  these  metrics  is  evaluated  by  applying  them  to  a large,  real  world,  Ada  project  (CCPDS-R). 

These  measures  can  be  automated,  consistent,  and  easy  to  use.  Along  with  subjective  interpretation 
to  account  for  the  lifecycle  context,  objective  insight  into  product  quality  can  be  achieved  early  where 
correction  or  improvement  can  be  instigated  more  efficiently. 

Index  Term#-  Evolutionary  Development,  Software  Quality  Metrics,  Ada,  Maintainability,  Process 
Improvement. 
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BACKGROUND 

There  have  been  many  attempts  to  define  measures  of  software  quality  in  the  past  20  years.  For  many 
reasons,  none  of  these  has  caught  on  as  accepted  practice  in  the  software  industry.  [2]  discusses  many  of 
the  problems  and  tradeoffs  associated  with  defining  and  measuring  software  quality.  One  of  the  recurring 
themes  in  this  work  was  the  need  for  subjectivity  and  expensive  human  resources  in  both  the  collection 
and  interpretation  of  quality  metrics.  Furthermore,  the  concept  of  a technology  independent  set  of 
metrics,  although  an  acknowledged  desire,  was  not  well  understood.  [8]  provides  an  excellent  discussion 
of  the  need  for  objective,  measurable  software  quality  metrics  which  remain  technology  independent.  [9] 
defines  a complete  company  metrics  program  with  actual  data  that  provides  some  valuable  experience 
and  lessons  learned.  [10]  describes  the  most  current  motivation  for  measuring  software  quality,  process 
improvement. 

After  three  years  of  successful  software  development  on  the  Command  Center  Processing  and  Display 
System  - Replacement  (CCPDS-R)  project  using  modem  Ada  software  engineering  techniques  ([12],  [13] 
and  [15]),  TRW  has  derived  a subset  of  software  quality  metrics  which  are  measurable,  objective,  and 
useful  in  providing  a basis  for  improving  downstream  quality  of  products  and  processes.  One  of  the 
problems  with  typical  government  contracted  systems  like  CCPDS-R  is  that  most  are  one  of  a kind 
projects.  This  characteristic  provides  added  complexity  to  measurement  since  the  experience  may  be  only 
partially  useful  between  different  project  domains.  The  metrics  presented  herein  hare  been  fozmnlaXed 
to  be  as  useful  as  possible  while  remaining  relatively  domain  independent  so  that  comparisons  between 
different  projects  are  possible.  This  is  not  as  simple  as  it  sounds  and  the  literature  on  software  quality 
metrics  reinforces  this  experience.  After  many  iterations,  the  data  presented  herein  has  demonstrated 
objective  and  valuable  insight  in  its  application  to  CCPDS-R  and  it  provides  a credible  basis  from  which 
better  metrics  can  be  derived. 
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Software  Quality  Metrics  Objecting.  Software  quality  metrics  should  be  simple,  easy  to  nse,  aad 
hard  to  misuse.  They  should  be  useful  to  project  management,  stimulate  continuous  improvement  of 
our  development  process,  aad  low  cost  to  administer  consistently  across  different  projects. 

Usefulness.  Conventional  testing  techniques  exist  for  assessing  the  functionality,  reliasdUy  and 
performance  of  a software  product,  however,  there  are  no  accepted  methods  for  assessing  its  flexibility 
(modularity;  changeability,  or  maintainability ).  While  there  are  many  other  perspectives  of  quality  (e.g., 
portability,  interoperability,  etc.),  our  experience  in  executing  an  evolutionary  development  process  has 
demonstrated  that  its  flexibility  aspects  are  the  most  important.  The  easier  the  product  is  to  modify,  the 
easier  it  is  to  achieve  any  other  software  quality  perspective  except  perhaps  performance.  The  tradeoff* 
between  flexibility  and  performance  is  highly  dependent  on  the  application  domain  as  well  as  many  other 
architectural  issues  and  for  the  purposes  of  this  discussion  we  will  assume  that  performance  is  achieved 
through  proper  hardware  selection  and  that  the  project  is  prioritised  "software  first".  A project  which 
is  prioritized  more  towards  performance  (i.e.,  1750A  flight  program),  may  not  interpret  these  metrics  in 
the  same  fashion  as  a project  prioritised  towards  continuous  lifecycle  modification  (i.e.,  ground  based  Cz 
System).  This  paper  will  attempt  to  provide  useful,  objective  definitions  for  modularity,  changeability 
and  maintainability.  The  intent  of  this  metrics  program  is  to  provide  a mechanism  for  quantifying  both 
end-product  quality  as  well  as  in-progress  development  trends  toward  achieving  that  quality. 

Development  Language.  Ada  has  proven  to  support  increased  quality  and  the  evolutionary  process 
model  in  large  software  development  efforts.  Furthermore,  Ada  appears  to  be  the  language  of  choice  for 
the  majority  of  current  and  future  large  government  projects.  While  this  paper  assumes  that  Ada  is 
the  language  for  design  and  implementation  of  software  development  projects  which  use  these  software 
quality  metrics,  it  should  be  straightforward  to  adapt  this  approach  to  other  languages  through  a suitable 
redefinition  of  a Source  Line  or  Code  (SLOC). 

Development  Approach.  An  evolutionary  development  approach  as  prescribed  in  the  Ada  Process 
Model  [12]  is  necessary  to  maximize  the  usefulness  of  these  metrics  across  a broader  range  of  the  life 
cycle.  The  metrics  are  derived  from  controlled  configuration  baselines.  Therefore,  an  approach  with  early 
incremental  baselines  will  see  an  increased  benefit.  As  a prerequisite  to  understanding  the  derivation 
of  the  software  quality  metrics,  the  following  section  provides  an  overview  of  the  Ada  Process  Model 
employed  on  CCPDS-R. 


Ada  PROCESS  MODEL 

An  Evolutionary  Process  Model  Is  fundamental  to  this  approach  for  Software  Quality  Assessment. 
Without  tangible  intermediate  products,  »c:‘  vare  quality  assessment  would  be  ineffective  and  inaccurate. 
Conventional  experience  has  repeatedly  seen  projects  sequence  through  highly  successful  preliminary  and 
critical  design  phases  (as  perceived  by  conventional  Design  Review  assessment  of  design  quality)  only  to 
have  the  true  quality  problems  surface  in  the  integration  and  test  phases  with  little  or  no  time  for  proper 
resolution.  An  Evolutionary  rv^tss  Model  provides  a systematic  approach  for  achieving  early  insight 
into  product  quality  and  a wui'-'va  lifecycle  measure  for  its  assessment.  It  also  avoids  the  inevitable 
degradations  in  quality  due  to  :ate  breakage  and  rapid  fixes  which  are  shoehorned  into  the  product 
without  adequate  software  engineering. 

TRW’s  Ada  Process  Model  is,  in  simplest  terms,  a uniform  application  of  incremental  Ada  product 
evolution  coupled  with  a demonstration- based  approach  to  design  review  for  continuous  and  insightful 
thread  testing  and  risk  management.  The  techniques  employed  within  this  process  are  derived  from  the 
philosophy  of  the  Spiral  Model  [7]  with  emphasis  on  an  evolutionary  design  approach.  The  use  of  Ada 
as  the  life  cycle  Language  for  design  evolution  provides  the  vehicle  for  uniformity  and  provides  a basis 
for  consistent  software  progress  and  quality  metrics. 

TRW’s  Ada  Process  Model  recognizes  that  ail  large,  complex  software  systems  will  suffer  from  design 
breakage  due  to  early  unknowns.  It  strives  to  accelerate  the  resolution  of  unknowns  and  correction  of 
design  flaws  in  a systematic  fashion  which  permits  prioritized  management  of  risks.  The  dominant  mech- 
anism for  achieving  this  goal  is  a disciplined  approach  to  incremental  development . The  key  strategies 
inherent  in  this  approach  are  directly  aimed  at  the  three  main  contributors  to  software  diseconomy  of 
scale:  minimizing  the  overhead  and  inaccuracy  of  interpersonal  communications,  eliminating  rework,  and 
converging  requirements  stability  as  quickly  as  possible  in  the  lifecycle.  These  objectives  are  achieved 

by: 
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1.  requiring  continuous  and  early  convergence  of  individual  relations  in  a homogeneous  life  cycle 
language  (Ada). 

2.  eliminating  ambiguities  and  unknowns  in  the  problem  statement  and  the  evolving  solution  as 
rapidly  as  practical  through  prioritised  development  of  tangible  increments  of  capability. 

Although  many  of  the  disciplines  and  techniques  presented  herein  can  be  applied  to  non- Ada  projects, 
the  expressiveness  of  Ada  as  a design  and  implementation  language  and  support  for  partial  implemen- 
tation (abstraction)  provide  a strong  platform  for  creating  a uniform  approach. 

Many  of  the  Ada  Process  Model  strategies  (summarised  in  Figure  1)  have  been  attempted,  in  part,  on 
other  software  development  efforts;  however,  there  are  fundamental  differences  in  this  approach  compared 
to  conventional  software  development  models. 


Conventional  Counterpart 


PDL/HOL 

Monolithic  Development 
Integration  and  Test 
Documentation  Based  Design  Review 
Quality  by  Inspection 


Process  Model  Strategy 


Uniform  Ada  Lifecycle  Representation 
Incremental  Development 
Design  Integration 
Demonstration  Based  Design  Review 
Total  Quality  Management 


Figure  1:  New  Techniques  vs.  Conventional  Techniques 

Uniform  Ada  Lifecycle  Representation.  The  primary  innovation  in  the  Ada  Process  Modes  is  the 
use  of  a single  language  for  the  entire  software  lifecycle,  including,  to  some  degree,  the  requirements 
phase.  All  of  the  remaining  techniques  rely  on  the  ability  to  equate  design  with  code  so  that  tic  only 
variable  during  development  is  the  level  of  abstraction.  This  provides  two  essential  benefits: 

1.  The  ability  to  quantify  unit*  of  software  (design /development /lest)  work  in  one  dimension.  Source 
Lines  of  Code  (SLOC).  While  it  is  certainly  true  that  SLOC  is  not  a perfect  absolute  metsure  of 
software,  with  consistent  counting  rules,  it  has  proven  to  be  the  best  normalized  measure  md  does 
provide  an  objective,  consistent  basis  for  assessing  relative  trends  across  the  project  life  cycle. 

2.  A formal  syntax  and  semantics  for  lifecycle  representation  with  automated  verqicatim  by  an  Ada 
compiler.  Ada  compilation  does  not  provide  complete  verification  of  a component.  It  does  go  a 
long  way,  however,  in  verifying  configuration  consistency,  and  ensuring  a standard,  unamagucus 
representation. 

Incremental  Development.  Although  risk  management  through  incremental  development  is  empha- 
sised as  a key  strategy  of  the  Ada  Process  Model,  it  was  (or  always  should  have  been)  a key  part  of  most 
conventional  models.  Without  a uniform  lifecyde  language  as  a vehicle  for  incremental  design /code/ teat, 
conventional  implementations  of  incremental  development  were  difficult  to  manage.  This  management 
is  simplified  by  the  integrated  techniques  of  the  Ada  Process  Model. 

Design  Integration.  In  this  discussion,  we  will  take  a simple  minded  view  of  “design’  as  the  structural 
implementation  or  partitioning  of  software  components  (in  terms  of  function  and  performance)  and 
definition  of  their  interfaces.  At  the  highest  level  of  design  we  could  be  talking  about  conventional 
requirements  definition,  at  the  lowest  level,  we  are  talking  about  conventional  detailed  design  z nd  coding. 
Implementation  is  then  the  development  of  these  component*  to  meet  their  interfaces  while  providing 
the  necessary  functional  performance.  Regardless  of  leoelr  the  activity  being  performed  u Ada  coding. 
Top  level  design  means  coding  the  top  level  components  (Ada  main  programs,  task  executives,  global 
types,  global  objects,  top-level  library  units  , etc.).  Lower  level  design  means  coding  the  lower  level 
program  unit  specifications  and  bodies. 

The  postponement  of  all  coding  until  after  CDR  in  conventional  software  development  approaches 
also  postponed  the  primary  indicator  of  design  quality,  iategrability  of  the  interfaces.  The  Ad*  Process 
Model  requires  the  early  development  of  a Software  Architecture  Skeleton  (SAS)  as  a vehicle  far  early 
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interface  definition.  The  SAS  essentially  corresponds  to  coding  t he  top  level  components  and  their 

interfaces,  compiling  them,  and  pro  riding  adequate  drirers/stnbs  so  that  they  can  be  executed.  This 

early  development  forces  early  baselining  of  the  software  interfaces  to  best  effect  smooth  evolution,  +* 

early  evaluation  of  design  quality  and  avoidance  of  downstream  breakage.  In  this  process,  we  have 

made  integration  a design  activity  rather  than  a test  activity.  To  a large  degree,  the  Ada  language 

forces  integration  through  its  library  rules  and  consistency  of  compiled  components.  It  also  supports 

the  concept  of  separating  structural  definition  (specifications)  from  runtime  function  (bodies).  The 

Ada  Process  Model  expands  this  concept  further  by  requiring  structural  design  (SAS)  prior  to  runtime 

fa  action  (executable  threads).  Demonstrations  provide  a forcing  function  for  broader  runtime  integration  * 

to  augment  the  compile  time  integration  enforced  by  the  Ada  language.  * 

Demonstration  Based  Design  Review.  Many  conventional  projects  built  demonstrations  or  bench- 
marks of  standalone  design  issues  (e-g.,  user  system  interface,  critical  algorithms,  etc.)  to  support 
design  feasibility.  However,  the  design  baseline  was  represented  on  paper  (PDL,  simulations,  flowcharts, 
vu graphs).  These  representations  were  vague,  ambiguous  and  not  amenable  to  configuration  control. 

The  degree  of  freedom  in  the  design  representations  made  it  very  difficult  to  uncover  design  flaws  of 
substance,  especially  for  complex  systems  with  concurrent  processing.  Given  the  typical  design  review 
attitude  that  a design  is  "'innocent  until  proven  guilty*,  it  was  quite  easy  to  assert  that  the  design  was 
adequate.  This  was  primarily  due  to  the  lack  of  a tangible  design  representation  from  which  true  design 
flaws  were  unambiguously  obvious.  Under  the  Ada  Process  Model,  design  review  demonstrations  provide 
some  proof  of  innocence  and  are  far  more  efficient  at  identifying  and  resolving  design  flaws.  The  subject 
of  the  design  review  is  not  only  a briefing  which  describes  the  design  in  human  understandable  terms, 
but  also  a demonstration  of  important  aspects  of  the  design  fosettne  which  verify  design  quality  (or  lack 
of  quality). 

Total  Quality  Management  (TQM).  In  the  Ada  Process  Model  there  are  two  key  advantages  for 
applying  TQM.  The  first  is  the  common  Ada  format  throughout  the  lifecyele  which  permits  consistent 
software  metrics  across  the  software  development  work  force.  Although  these  metrics  don't  ail  pertain 
to  quality  (many  pertain  to  progress),  they  do  permit  a uniform  communications  vehicle  for  achieving 
the  desired  quality  in  an  efficient  manner.  Secondly,  the  demonstrations  serve  to  provide  a common  goal 
for  the  software  developers.  This  “integrated  product*  is  a reflection  of  the  complete  design  at  various 
phases  in  the  life  cycle  for  which  all  personnel  have  ownership.  Rather  than  individually  evaluating 
components  which  are  owned  by  individuals,  the  demonstrations  provide  a mechanism  for  reviewing  the 
team's  product.  This  team  ownership  of  the  demonstrations  is  an  important  motivation  for  instilling  a 
TQM  attitude. 

SOFTWARE  QUALITY  METRICS  APPROACH 

In  essence,  the  approach  we  are  taking  is  similar  to  that  of  (8]  who  proposes  to  measure  software  qual- 
ity through  the  absence  of  spoilage*  While  his  definitions  are  purposely  vague  (to  remain  technology  and 
project  independent),  ones  are  quite  explicit.  The  key  to  this  metrics  approach  is  similar  to  conventional 
cost  estimation  techniques  such  as  COCO  MO  [3]  where  quantifiability  and  consistency  of  application 
are  important.  Note  that  software  cost  estimation  has  subjective  inputs  and  objective  outputs.  Our 
approach  will  define  objective  inputs  which  may  require  subjective  interpretation  for  project  context. 

Our  primary  metrie  for  software  quality  will  be  rework  as  measured  by  changed  SLOC  in  configured 
baselines.  This  metric  will  also  need  to  be  adjusted  for  project  context  to  accommodate  the  product 
characteristics,  the  life  cycle  phase,  etc.  The  software  quality  assessment  derived  from  this  objective 
collection  of  rework  metrics  will  require  subjective  analysis  in  some  cases.  The  subjectivity  here  is  in 
the  fact  that  we  are  trying  to  assess  quality  during  development  (this  requires  subjective  analysis)  using 
the  same  metrics  used  to  assess  quality  following  development  (objective  analysis).  For  example,  the 
volume  of  rework  following  product  delivery  is  an  objective  measure  of  quality,  or  lack  of  quality.  The 
amount  of  rework  following  the  first  configuration  baseline  during  development  is  a subjective  measure. 

Zero  rework  might  be  interpreted  as  a perfect  baseline  (unlikely),  an  inadequate  test  program,  or  an 
unambitious  first  build.  The  following  paragraphs  define  some  of  the  foundations  in  this  approach: 

Software  Quality  Definition.  Software  quality  is  the  degree  of  compliance  with  the  customer  expec • . 

taiions  of  /unction,  performance,  cost  and  schedule.  This  is  an  incredibly  difficult  concept  to  make 
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objective.  The  only  mechanisms  available  for  defining  “customer  expectations"  are  Software  Require- 
ments Specifications  for  function. and  performance,  and  an  approved  expenditure  plan  which  quantifies 
cost  and  schedule  goals  (basically,  this  corresponds  to  the  “contract").  These  two  mechanisms  are  tra- 
dition ally  the  lowest  quality  products  produced  by  a project  since  they  are  required  to  be  agreed  upon 
with  numerous  unknowns  far  too  early  in  the  lifecycle.  The  evolutionary  process  model  and  software 
quality  metrics  should  provide  better  insight  into  the  degree  of  compliance  with  customer  expectations 
in  the  above  four  perspectives. 

Software  Change  Order  (SCO).  A Software  Change  Order  constitutes  direction  to  proceed  with 
changing  a configured  software  component.  This  change  may  be  needed  to  1)  rework  a component 
with  bad  quality  (a  fix),  or  2)  rework  a component  to  achieve  better  quality  (an  enhancement)  or 
3)  accommodate  a customer  directed  change  in  requirements.  The  difference  between  the  first  two 
types  of  rework  is  inherent  in  the  necessity  for  the  change.  If  the  change  is  required  for  compliance 
with  product  specifications,  then  the  rework  is  type  I.  If  the  change  is  desired  for  cost-effectiveness, 
increased  testability,  increased  usability,  or  other  efficiency  reasons  (assuming  the  unchanged  component 
is  compliant),  then  the  rework  is  type  2.  In  both  cases,  the  rework  should  result  in  increased  end 
product  quality  (requirements  compliance  per  dollar),  however,  type  1 also  indicates  inadequate  quality 
in  a current  busciine.  In  practice,  differentiating  between  type  1 and  type  2 may  be  quite  subjective. 
As  discussed  later,  most  of  the  metrics  are  insensitive  to  the  categorization,  but  if  the  differentiation  is 
consistently  applied,  it  can  provide  useful  insight.  Conventionally,  SCOs  were  called  Software  Problem 
Reports  (SPRs).  To  avoid  confusion  (“problem"  has  a negative  connotation,  and  not  ail  changes  are 
necessarily  problems),  we  have  changed  the  terminology.  The  software  quality  metrics  collection  and 
analysis  will  use  type  I and  type  2 SCOs  in  an  appropriate  manner.  Type  3 SCOs  need  to  be  separated 
since  they  do  not  reflect  any  change  in  quality,  they  do  however,  redefine  the  customer  expectations. 
Furthermore,  Type  3 SCOs  typically  reflect  a change  which  is  of  more  global  impact  thereby  requiring 
various  levels  of  software  and  system  engineering  *s  well  aa  high  level  regression  testing.  These  types  of 
SCOs  will  not  be  used  in  these  metrics  due  to  this  wide  range  of  variability.  Rather,  the  data  derived 
from  type  I and  type  2 SCOs  should  provide  a solid  basis  for  estimating  maintainability  and  the  effort 
required  for  type  3 SCOs. 

Source  Lines  of  Code  (SLOC).  There  has  always  been  a controversy  as  to  whether  SLOC  provides 
a good  metric  for  measuring  software  volume  (DeMarco  calls  this  6anj).  (11)  identifies  some  of  the 
precautions  necessary  when  dealing  with  SLOC.  Upon  reading  open  literature  which  discusses  project 
productivities  (SLOC/MM),  it  is  easy  to  see  that  there  is  little,  if  any,  comparability  between  projects 
within  the  same  company  no  less  projects  from  different  companies.  [4]  identifies  the  pros  and  cons  of 
various  measures  and  comes  to  the  conclusion  that  there  is  nothing  better.  Everyone  agrees  however, 
that  whatever  one  uses,  it  must  be  defined  objectively  and  consistently  to  be  of  value  for  comparison. 
How  we  define  the  absolute  unit  of  SLOC  is  not  as  important  as  defining  it  consistently  across  all  projects 
and  all  areas  of  a specific  project.  Therefore,  the  preferred  way  to  define  a SLOC  is  the  following: 

The  number  of  SLOC  for  a given  set  of  Ada  program  units  is  defined  as  the  output  of  a 
SLOC  Counting  Tool. 

Enforcing  this  definition  is  simple  to  achieve  by  providing  a portable  tool.  By  accepting  certain  nen- 
controversial  and  simple  standards  for  program  unit  headers  and  program  layout  the  tool  can  provide 
more  valuable  output  than  simply  SLOC  counts  (e.g.,  static  hierarchies,  and  complexity  ratings). 

Ada/COCOMO  (5j,  [6]  defines  SLOC  for  Ada  programs  as:  Within  an  Ada  specification  part,  each 
carriage  return  counts  as  one  SLOC.  Specifications  shall  be  coded  with  the  following  standards  (rationale 
is  provided  in  italics): 

1.  each  parameter  of  a subprogram  declaration  be  listed  on  a separate  line  ( The  design,  of  a subpro- 
gram interface  is  done  in  one  place  and  generally  the  effort  associated  unth  the  interface  design  u 
dependent  on  the  number  of  parameters.) 

2.  for  custom  enumeration  types  (e.g.,  system  state,  socket  names,  etc.)  and  record  types  each 
enumeration  or  field  should  be  listed  on  a separate  Line.  ( Custom  types  usually  involve  custom 
design  and  engineering ( hence  on  increased  number  of  SLOC,) 
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3.  for  predefined  enumeration  types  («•«••  keybo^d  key*,  eompau  direction*),  enumerations  should 
be  (Lted  on  a.  few  line.  a.  pooible  without  low  of  readability.  (There  W of  typer  jenerufly 
rctpurt  no  ctutom  engineering.) 

4.  Initialisation  of  compose  object,  (e.g.,  record,  or  arrays)  should  be  listed  with  one  component 
per  line.  (Frequently,  each  of  there  arrijnmentr  reprerentr  a autom  statement,  «n  other,  cieure 
u typically  used  /or  the  non-etutom  arrijnmentr.) 

Within  Ada  bodies  each  semi-colon  counts  a.  one  SLOC.  Generic  instantiation,  count  one  line  Cor  each 

^The  definition  shore  treat,  declarative  (speciJieation)  design  much  more  sensitively  than  it  does 
executable  (body)  design.  It  also  does  not  recognise  the  declarative  part  of  a body  u the  same  importance 
„ a specification  part  Although  these  and  other  debates  can  surface  with  respect  to  the  optimum 
definition  of  a SLOC,  the  optimum  ab.oluU  definition  is  far  less  important  than  a consistent  relate 

defiQuriT«T  Control  Board.  The  QCB  constitutes  the  governing  body  responsible  for  authoring  changes 
to  a configured  baseline  product  (conventionally  known  as  a configuration  control  board  - CCB).  This 
body  is  composed,  at  a minimum,  of  the  development  manager,  customer  representative,  each ^produd 
manager,  systems  effectiveness  representative  and  the  test  manager.  The  QCB  deadm  oa 
changes  to  configured  product,  and  approves  all  SCOs.  The  QCB  is  responsible  for  collecting  he 
Software  Quality  metrics,  objectively  and  subjectively  analysing  trends,  and  proposing  changes  to 
development  process,  tools,  product,  or  personnel  to  improve  future  quality. 

Configured  Baseline.  A configured  baseline  constitute,  a set  of  product,  which  are  subjected^ 
change  controlthrou-iT.  Quality  Control  Board  (QCB).  Configured  baseUncs  usually 
diaU  products  which  have  completed  design,  development,  and  informal  test  and  final  product,  which 

have  completed  formal  test. 

METRICS  DERIVATION 

The  remainder  of  this  paper  provides  substantial  deUil  in  the  definition  and  description  of  the 
necessary  statistics  to  be  collected,  the  metrics  derived  from  these  statistics  and  their  l“tetPr'‘* 

This  section  provides  a simple  overview  of  how  these  metrics  were  derived,  the  nec«=sUTofsomeof 
the  collected  statistics  and  their  raison  d'etre.  The  following  derivations  are  not  an  obvwus  top  down 
progression,  rather,  they  resulted  from  substantial  trial  and  error,  numerous  dead  end  analyses,  intuition 

“VhTSdLnental  hypothesis  was  that  their  was  significant  information  content  in  the  changer 
of  rework  being  performed  over  the  project  lifecycle.  The  obvious  raw  sU^tics  to  collect  uiclu  c 
number  and  t£c  of  software  changes,  SLOC  damaged,  and  SLOC  fixed.  The  P^lcm  -“  to  find 
the  right  filtering  techniques  for  the  raw  rework  statistics  which  mentify  useful  trend,  and  to  uncover 
objective  measures  which  quantify  product  attributes  both  during  development  and  as  an  end-product^ 
oL  original  intent  was  to  provide  a quantification  of  the  products  modularity,  changeabdity,  and 
maintainability.  The  first  two  are  intuitively  simple  to  define  as  a function  of  rework,  the  third  ts  more 
subtle: 

Modularity  (Q^):  The  average  extent  of  breakage.  This  identifies  the  need  to  quantify  extent  of 
breakage  (we  will  use  volume  of  SLOC  damaged)  and  number  of  instances  of  rewor*  (Number  of 
SCO,).  In  effect  we  are  defining  modularity  as  a measure  of  breakage  localisation. 

Changeability  {Qc)-  The  average  complexity  of  breakage.  This  identifies  the  need  to  quantify  com- 
p testy  of  breakage  (we  will  use  effort  required  to  resolve)  and  number  of  instances  of  rework 

(Number  of  SCOs). 

Maintainability  (Q*)t  Theoretically  the  maintainability  of  a product  is  related  to  the  productivity 
with  which  the  maintenance  team  can  operate.  Productivities  however,  are  so  difficult  to  compare 
between  projects  that  this  definition  was  intuitively  unsatisfying.  If  we  ratio  the  productivity 
of  rework  to  the  productivity  of  development,  we  end  np  with  a value  which  is  independent  of 
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productivity  but  yet  a reflection  of  the  complexity  to  change  a product  in  relation  to  the  complexity 
to  develop  it.  This  normalises  out  the  project  productivity  differences  and  provides  a relatively 
comparable  metric.  Maintainability  then,  will  be  deflned  as  the  ratio  of  rework  productivity  and 
development  productivity.  Intuitively,  this  value  identifies  a product  which  can  be  changed  three 
times  as  efficiently  (Qu  = .33)  ss  it  was  developed  as  haring  a better  (lower)  maintainability  than 
a product  that  can  be  changed  twice  as  efficiently  (Qu  = -5)  as  it  was  developed,  independent  of 
the  absolute  maintenance  productivity  realised.  The  statistics  needed  to  compute  these  values  are 
the  total  development  effort,  total  SLOC,  total  rework  effort  and  total  reworked  SLOC. 

While  the  values  above  provide  useful  end-product  objective  measures,  their  intermediate  values  as  a 
function  of  time  would  also  provide  insight  during  the  development  process  into  the  expected  end-product 
values.  Furthermore,  once  we  hare  gained  some  experience  with  maintenance  of  early  increments,  this 
experience  should  be  useful  for  predicting  the  rework  inherent  in  remaining  increments. 

The  above  brief  derivation  is  starting  to  push  the  limits  of  our  first  goal  (simplicity)  and  the  following 
sections,  on  the  surface,  will  appear  to  be  somewhat  complex.  A few  remarks  about  this  are  in  order. 
First,  there  will  always  be  a tradeoff  between  simplicity  and  real  insight.  Surface  insight  is  usually 
attained  very  simply,  detailed  insight  requires  added  knowledge  and  complexity.  We  have  chosen  a set 
of  metrics  which  range  from  simple  to  moderately  complex  to  cover  the  multiple  perspectives  needed  by 
project  management  to  ensure  accuracy.  It  is  not  necessary  to  deal  with  these  metrics  as  a complete  set. 
Subsets,  or  different  sets  are  also  useful.  Secondly,  most  of  the  analysis,  mathematics  and  data  collection 
inherent  in  these  metrics  should  be  automated  so  that  managers  need  only  interpret  the  results  and 
understand  their  basis. 

The  above  values  were  determined  through  extensive  analysis,  trial  and  error,  and  intuition.  There 
are  certainly  other  metrics  derivable  from  rework  statistics  which  would  also  provide  useful  insight.  The 
following  sections  provide  more  detailed  descriptions  and  notations  for  the  collected  statistics  (Table  1), 
in-progress  indicators  (Tabic  2),  and  end-product  quality  metrics  (Table  3).  Hypothetical  expectations 
are  provided  in  Figure  2 for  the  in-progress  indicators  and  collected  statistics. 

Collected  Statistics 

Table  1 identifies  the  necessary  statistics  which  must  be  collected  over  the  Lifecycle  to  implement  our 
proposed  metrics. 

Total  Source  Lines  The  SLOC ? metric  tracks  the  estimated  total  size  of  the  product  under  develop- 
ment. This  value  may  change  significantly  over  the  life^of  the  development  as  early  requirements 
unknowns  are  resolved  and  as  design  solutions  mature.  This  total  should  also  include  reused 
software  which  is  part  of  the  delivered  product  and  subject  to  contractor  maintenance. 

Configured  SLOC  This  metric  simply  tracks  the  transition  of  software  components  from  a maturing 
design  state  into  a controlled  configuration.  For  any  given  project,  this  metric  will  provide  insight 
into  progress  and  stability  of  the  design /development  team.  [12|  discusses  some  of  the  tradeoffs  and 
risk  management  philosophy  inherent  in  laying  out  an  incremental  build  approach.  For  projects 
with  reused  software,  there  will  be  an  early  contribution  to  SLOCc  and  thus  “immediate  progress'’ 
and  quality  metrics  as  defined  below. 

Errors  Real  errors  (type  1 SCOs)  constitute  an  important  metric  from  which. many  of  the  following  are 
derived.  The  expectation  is  that  the  highest  incidence  of  uncovering  errors  happens  immediately 
after  the  turnover  and  decreases  with  time  (i.e.,  the  software  matures). 

Improvements  The  other  stimulus  for  changing  & baseline,  improvements  (type  2 SCOs),  are  also  key 
to  the  assessment  of  quality  and  progress  towards  producing  quality.  The  expectation  for  improve- 
ments is  approximately  inversely  proportional  to  errors,  in  that  as  the  error  rate  starts  off  high  and 
damps  out,  the  improvements  start  off  low  (the  focus  is  on  errors)  and  increase.  This  phenomenon 
is  basically  derived  from  the  assumption  that  a fixed  team  is  working  the  Test/ Maintenance  pro- 
gram and: 
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Statistic 


Total  Source  Lines 


Definition 


SLOCr  = Total  Product  SLOC 


Configured  Source  Lines 


SLOCc  — Standalone  Tested  SLOC 


Errors 


SCOf  = No.  of  Open  Type  1 SCOs 
SCO{  = No.  of  Closed  Type  1 SCOs 
SCOi  = No.  of  Type  1 SCOs 


Improvements 


SCOJ  = No.  of  Open  Type  2 SCO s 
SCOJ  = No.  of  Closed  Type  2 SCOs 
SCOa  = No.  of  Type  2 SCOs 


Open  Rework 


Closed  Rework 


Total  Rework 


B\  — Damaged  SLOC  Due  to  SCO f 
Bz  = Damaged  SLOC  Due  to  SCO, 


F\  =SLOC  Repaired  after  SCO{ 
Fi  =5L0C  Repaired  after  SCO\ 


R\  = Fi  + B i 
Ri  = Fj  4-  Fa 


Insight 


Total  Effort 


Demonstrable  Progress 


Test  Effectiveness 
Test  Progress 
Reliability 


Value  Engineering 
Design  Progress 


Fragility 
Schedule  Risks 


Maturity 

Changeability 


Design  Quality 
Maintainability 
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Table  Is  Collected  Raw  Data  Definitions 
Efforts*—**  + Efforiimfwm  wnui  = ConMtant 

The  actual  differentiation  between  Type  1 and  Type  2 is  somewhat  subjective.  The  metrics  d eased 
herein  are  not  particularly  sensitive  to  either  type  since  they  rely  on  the  sum  of  the  impacts  from 
both  types.  However,  the  difference  between  Type  1 damage  and  Type  2 damage  may  provide 
useful  insight  as  demonstrated  on  CCPDS-R- 

Open  Rework  Theoretically,  all  rework  corresponds  to  an  increase  in  quality.  Either  the  rework  is 
necessary  to  remove  an  instance  of  “bad”  quality  (SCOi),  or  to  enhance  a component  for  life 
cycle  cost  effectiveness  (SCC2).  The  dynamics  of  the  rework  coupled  with  the  project*  schedule 
coatezt  must  be  evaluated  to  provide  an  accurate  assessment  of  quality  trends.  A err  tain  amount 
of  rework  is  a necessity  in  a large  software  engineering  effort.  In  (act,  early  rework  is  considered 
a sign  of  healthy  progress  in  the  evolutionary  process  modeL  Continuous  rework,  late  rework,  or 
tero  rework  due  to  the  non-existence  of  a configured  baseline  are  generally  indicators  a£  negative 
quality.  Interpretation  of  this  metric  requires  project  context.  In  general  however,  tine  rework 
must  ultimately  go  to  zero  at  product  delivery.  In  order  to  provide  a consistent  and  automatable 
collection  process,  rework  is  defined  as  the  number  of  SLOC  estimated  to  change  due  to  an  SCO. 
The  absolute  accuracy  of  the  estimates  is  generally  unimportant  and  since  open  rework  tracked 
with  an  estimate  and  doted  rework  (see  below)  is  tracked  separately  with  actuals,  tbe  values 
continually  correct  themselves  and  remain  consistent. 
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Closed  Rework  Whereas  the  breakage  metrics  estimated  the  damage  done,  the  repair  metrics  should 
identify  the  actual  damage  which  was  fixed.  Upon  resolution,  the  corresponding  breakage  estimate 
should  be  updated  to  reflect  the  actual  required  repair  that  remains  in  the  baseline.  The  actual 
SLOC  fixed  will  clearly  never  be  absolutely  accurate.  It  will,  however,  be  relatively  accurate 
for  assessing  trends  inherent  in  these  metrics.  Since  fixed  can  take  on  several  different  meanings 
depending  on  what  is  added,  deleted  and  changed,  a consistent  set  of  guidelines  is  necessary. 
Changed  SLOC  will  increase  R i without  a change  to  SLOCc • Added  code  will  increase  R\  and 
SLOCc  t although  not  necessarily  in  the  same  proportion.  Dieted  code  (not  typically  a problem) 
with  no  corresponding  addition  conld  reduce  both  R\  and  SLOCc ■ A conventional  differences 
tool  with  an  appropriate  preprocessor  which  converts  properly  formatted  source  files  into  a format 
which  contains  no  comments  and  1 SLOC  per  compared  record  would  be  the  best  method  for 
computing  changed  SLOC.  A simpler  method  (and  the  one  used  here)  would  be  to  simply  estimate 
the  magnitude  of  the  fixed  SLOC.  Given  the  volume  of  changes  and  the  need  for  only  roughly 
accurate  data  for  identifying  trends,  the  accuracy  of  the  raw  data  is  relatively  unimportant. 

In- Progress  Indicators 

Table  2 defines  the  in-progress  indicators  and  Figure  2 identifies  relative  expectations.  It  is  difficult  to 
define  the  absolute  expectations  for  the  in-progress  metrics  without  comparable  data  from  other  projects. 
Relative  expectations  are  described  in  the  following  paragraphs. 


Indicator 

Definition 

Insight 

Rework  Ratio 

RR=fz&: 

Future  Rework 

Rework  Backlog 

Open  Rework 

Rework  Stability 

SS  = (Rx  + Rt)  - (Fi  + Fj) 

Rework  Trends 

Table  2;  In  Progress  Indicator  Definitions 


Rework  Ratio  The  sum  of  the  currently  broken  product  (B\  4*  J9i)  and  the  already  repaired  breakage 
(F\  -l-  Fj)  corresponds  to  the  mass  of  the  current  product  baseline  which  has  needed  rework  (R\  4* 
Rj).  The  rework  ratio  (RR)  identifies  the  current  ratio  of  SLOCc  which  is  expected  to  undergo 
rework  prior  to  maturity  into  an  end  product.  The  expectation  for  RR  shown  in  Figure  2 is  to 
increase  to  a stable  value  with  minor  discontinuities  following  the  initial  delivery  of  each  build. 

Rework  Backlog  The  current  backlog  of  rework  is  defined  as  the  percentage  of  the  current  SLOCc 
which  is  currently  in  need  of  repair.  In  general,  one  would  expect  that  the  rework  backlog  should 
rise  to  some  level  and  remain  stable  through  the  test  program  until  it  drops  off  to  zero.  Large 
changes  from  month  to  month  should  clearly  be  investigated. 

Rework  Stability  The  difference  between  total  “ework  and  closed  rework  provides  insight  into  the 
trends  of  resolving  issues.  The  important  use  of  this  metric  is  to  ensure  that  the  breakage  rate  is 
not  outrunning  the  resolution  rate.  Figure  2 identifies  an  idealised  case  where  the  resolution  rate 
does  not  diverge  (except  for  short  periods  of  time)  from  the  breakage  rate.  Note  also  that  the 
breakage  rate  somewhat  tracks  the  SLOCc  delivery  rate.  A diverging  value  of  SS  would  indicate 
instability  of  rework  activities.  A stable  value  of  SS  would  indicate  systematic  and  straightforward 
resolution  activities. 
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Figure  2:  In-Progress  Indicators  Example  Expectations 
End-Product  Quality  Metrics 

The  end-product  metrics  reflect  insight  into  the  maintainability  of  the  software  products  with  respect 
to  type  I and  type  2 SC Os.  Type  3 SCOs  are  explicitly  not  included  since  they  redefine  the  inherent 
target  quality  of  the  system  and  tend  to  require  more  global  system  and  software  engineering  as  well  as 
some  major  re- ▼ end  cation  of  system  leYel  requirements.  Since  these  types  of  changes  are  dealt  with  in 
extremely  diverse  ways  by  different  customers  and  projects,  they  would  tend  to  cloud  the  meanings  and 
comparability  of  tie  data.  However,  the  metrics  data  below  should  be  very  helpful  in  determining  and 
planning  the  expesed  effort  for  implementing  type  3 SCOs. 

Rework  Proportions  The  Rb  *»lue  identities  the  percentage  of  effort  spent  in  rework  compared  to 
tie  total  effort.  In  essence,  it  probably  provides  the  best  indicator  of  productivity.  The  activi- 
ties included  in  these  efforts  should  only  include  the  technical  requirements,  software  engineering, 
design,  development,  and  functional  test.  Higher  level  system  engineering,  management,  configu- 
ration control,  verification  testing  and  higher  level  system  testing  should  be  excluded  since  these 
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Metric 

Definition 

Insight 

Rework  Proportions 

d Bf f9+tieor+Mt f—itao* 

ilB  ‘ £//»»«  T...4 

Rs  “ JiodrLi " 

Productivity 

Rework 

Project  Efficiency 

Modularity 

Qmmd  — 

Rework  Localization 

Changeability 

r\  _ 

VC  - SCOt+SCOt 

Risk  of  Modification 

Maintainability 

<?*  = & 

Change  Productivity 

Tabic  3:  End-Product  Quality  Metrics  Definitions 


activities  tend  to  be  more  a function  of  the  company,  customer  or  project  attributes  independent  cf 
quality.  The  goal  here  is  to  normalise  the  widely  varying  bureaucratic  activities  out  of  the  metrics. 
Rs  provides  a value  for  comparing  with  similar  projects,  future  increments,  or  future  projects  as 
well  as  other  in  progress  analyses.  Basically,  it  defines  the  proportion  of  the  product  which  had  to 
be  reworked  in  its  lifecycle.  Note  that  the  actual  value  could  be  greater  than  100%  . 

Modularity*  This  value  identifies  the  average  SLOC  broken  per  SCO  which  reflects  the  inherent  ability 
of  the  integrated  product  to  localise  the  impact  of  change.  To  the  maximum  extent  possible,  OCBs 
should  ensure  that  SCOs  are  written  for  single  source  changes. 

Changeability  This  value  provides  some  insight  into  the  ease  with  which  the  products  can  be  changecL 
While  a low  number  of  changes  is  generally  a good  indicator  of  a quality  process,  the  magnitude 
of  effort  per  change  is  sometimes  even  more  important. 


Maintainability  This  value  identifies  the  relative  cost  of  maintaining  the  product  with  respect  to  its 
development  cost.  For  example,  if  R&  = Rs,  one  could  conclude  that  the  cost  of  modification  is 
equivalent  to  the  cost  of  development  from  scratch  (not  highly  maintainable).  A value  of  Qm  much 
less  than  1 would  tend  to  indicate  a very  maintainable  product,  at  least  with  respect  to  development 
cost.  Since  we  would  intuitively  expect  maintenance  costs  of  a product  to  be  proportional  to  its 
development  cost,  this  ratio  provides  a fair  normalisation  for  comparison  between  different  projects. 
Since  the  numerator  of  Q m is  in  terms  of  effort  and  its  denominator  is  in  terms  of  SLOC,  it  is  a 
ratio  of  productivities  (i.e.,  effort  per  SLOC).  Some  simple  mathematical  rearrangement  will  snow 
that  Qm  is  equivalent  to: 


Expectations  It  is  difficult  to  define  the  expectations  for  the  end-product  metrics  without  comparable 
data  from  other  projects.  Now  that  we  have  solid  data  for  CCPDS-R,  we  can  form  expectations 
for  future  Increments  of  CCPDS-R  as  well  as  other  projects. 


The  above  descriptions  identify  idealised  trends  for  these  metrics.  Undoubtedly,  real  project  sit- 
uations will  not  be  ideal.  Their  differences  from  ideal,  however,  are  important  for  management  and 
customer  to  comprehend.  Furthermore,  the  application  of  these  metrics  on  project  increments  as  well 
as  the  project  as  a whole,  should  be  useful. 
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application  results 


Figure  3,  Figure  4 and  Tabic  4 provide  the  actual  data  to  date  for  the  CCPDS-R  project.  The  Command 
Center  Processing  and  Display  System  Replacement  (CCPDS-R)  project  will  provide  display  Information 
used  during  emergency  conferences  by  the  National  Command  Authorities;  Chairman,  Joint  Chiefs  of 
Staff;  Commander  in  Chief  North  American  Aerospace  Command;  Commander  in  ChiefUnited  States 
Space  Command;  Commander  in  Chief  Strategic  Air  Command;  and  other  nuclear  capable  Commanders 
in  Chief.  It  is  the  missile  warning  element  of  the  new  Integrated  Tactical  Warning/ Attack  Assessment 
System  developed  by  North  American  Aerospace  Defense  Command/Air  Force  Space  Command. 

The  CCPDS-R  project  is  being  procured  by  Air  Force  Systems  Command  Headquarters  Electronic 
Systems  Division  (ESD)  at  Hanscom  AFB  and  was  awarded  to  TRW  Defense  Systems  Group  in  June 
1987.  TRW  will  build  three  subsystems.  The  first,  identified  as  the  Common  Subsystem,  is  30  months 
into  development.  The  Common  Subsystem  consists  of  350,000  source  lines  of  Ada  with  a development 
schedule  of  38  months.  It  will  be  a highly  reliable,  real-time  distributed  system  with  a sophisticated 
User  Interface  and  stringent  performance  requirements  implemented  entirely  in  Ada.  CCPDS-R  Ada 
risks  were  originally  a very  serious  concern.  At  the  time  of  contract  definition,  Ada  host  and  target 
environment,  along  with  Ada  trained  personnel  availability  were  questionable. 

The  data  provided  in  this  paper  was  collected  by  manually  analysing  1500+  CCPDS-R  SCOs  main- 
tained online  and  in  hard  copy  notebooks.  Most  of  the  data  defined  in  the  previous  section  was  available 
in  the  SCOs.  Each  problem  description  and  resolution  was  evaluated  to  determine  whether  the  SCO 
was  type  1 or  type  2 and  whether  the  SCO  was  relevant  to  the  operational  product  (out  of  the  1500 
SCOs,  910  were  relevant,  the  remainder  were  SCOs  for  initial  turnovers,  support  tools,  test  software  or 
commercial  software).  Furthermore,  each  SCO  opened  contained  an  estimate  of  the  effort  to  fix  and 
each  closed  SCO  provided  the  actual  (technical)  effort  required  for  the  fix.  The  statistic  which  was  not 
present,  unfortunately,  was  the  actual  breakage  assessment  in  SLOC.  For  each  relevant  SCO,  the  SLOC 
breakage  estimate  was  based  on  experience  with  the  fix,  the  detailed  description  of  the  resolution,  the 
hours  of  analysis  and  the  hours  required  for  implementing  the  fix.  While  not  perfectly  accurate  in  ail 
cases,  these  estimates  are  at  least  consistent  relative  to  each  other  and  given  the  large  sample  space, 
relatively  accurate  for  the  intended  use.  Again,  it  is  not  that  important  to  be  absolutely  exact  when  the 
metrics  and  trends  are  derived  from  a large  sample  and  only  useful  to  at  most  1 or  2 digits  of  accuracy. 


CCPDS-R  Common  Subsystem  Analysis 

The  following  paragraphs  discuss  the  quality  metrics  results  for  the  CCPDS-R  common  subsystem 
as  a whole  with  conclusions  drawn  where  applicable.  Figure  3 provides  CCPDS-R  actuals  witlr  the 
incremental  build  sequence  (SLOCc)  overlayed  for  comparison. 

Configured  SLOC.  The  CCPDS-R  installments  of  SLOCc  delivered  small  initial  builds  (A0/A1 
and  A2)  with  the  highest  risk  components.  The  middle  build  (A3),  while  less  risky,  was  bulky  and  a 
substantial  portion  of  the  build  was  produced  by  (somewhat  immature)  automated  tools.  Nevertheless, 
it  was  installed  in  two  increments  (A31  and  A32). 

SCOs.  As  expected,  the  SCO  rate  is  proportional  to  the  SLOCc  rate.  The  actuals  also  suggest 
that  the  state  of  the  first  two  builds  was  higher  quality  at  delivery  than  the  third  build.  The  feeling 
of  the  development  managers  on  the  project  concurs  with  this  assessment  but  also  added  that  it  was 
during  the  A3-A4  timeframe  when  substantial  requirements  volatility  occurred  in  the  user  interface  and 
external  interface  definitions.  The  number  of  open  SCOs  has  remained  fairly  constant  with  respect  to 
the  number  generated  and  hence  indicative  that  the  rework  is  being  resolved  in  a timely  fashion. 

Rework  Resolution.  The  total  rework  (Rx  + Rj)  has  also  grown  at  a rate  proportional  to  SLOCc 
growth  but  its  rate  of  growth  is  decreasing.  Now  that  the  software  is  all  configured  and  turnovers  are 
complete,  breakage  should  start  damping  out  rapidly.  The  resolved  rework  (Fi  + Fj)  tracked  the  total 
rework  closely  with  little,  if  any  divergence.  The  last  three  months  indicate  that  the  rate  of  resolution  is 
exceeding  the  rate  of  breakage.  This  should  indicate  to  the  management  team  that  no  serious  problems 
are  lurking  in  the  future. 

Rework  Ratio.  The  rework  rate  has  grown  from  the  initial  builds  to  an  apparently  stable  value  of 
.15.  This  would  imply  that  the  initial  build  was  more  mature  delivery  than  the  second  and  third 
builds.  With  over  98%  of  the  software  in  SLC  this  valae  should  be  expected  to  be  fairly  stable  and  a 
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Figure  3:  CCPDS-R  Collected  Statistics 


good  predictor  of  future  rework.  The  amount  of  rework  backlog  in  proportion  to  SLOCc  has  remained 
fairly  constant  and  implies  that  the  divergence  of  breakage  rate  and  resolution  rate  should  correct  itseif 
shortly.  The  situation  here  is  that  substantial  increments  are  being  added  to  SLOCc  and  an  increase 
in  breakage  vs  resolution  is  expected  since  the  development  team  is  likely  focusing  on  installing  baseline 
components  rather  than  fixing  components. 

SCO  Effort  Distributions.  Figure  4 identifies  the  distribution  of  SCOs  by  the  effort  required  for 
resolution.  This  graphic  also  suggests  that  the  software  is  generally  easy  to  modify.  A deeper  analysis 
of  the  data  shows  that  the  majority  of  complex  SCOs  occurred  in  the  more  complex  early  builds. 
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Figure  4:  SCO  Effort  Distribution 


Rework  Proportions.  Rg  (Table  4)  defines  the  percent  of  the  development  efforts  devoted  to  rework. 
Since  we  only  tracked  the  technical  effort  in  analysing  and  implementing  resolutions,  we  have  compared 
it  to  the  software  development  effort  devoted  to  the  same,  namely,  the  requirements,  design,  development 
and  test  effort.  In  both  cases  we  eliminate  the  cost  of  management,  facility,  secretarial,  configuration 
management,  quality  assurance,  and  other  level  of  effort  administrative  activities.  Note  that  we  have 
included  the  software  requirements  analysis  effort  since,  in  our  evolutionary  approach,  there  is  only  a 
subtle  difference  between  requirements  and  design.  R$  defines  the  percentage  of  source  cede  which  has 
undergone  rework.  CCPDS-R  is  currently  projecting  a rework  ratio  of  14%  . 


Metric 

Definition 

CCPDS-R  Value 

Rework  Proportions 

R (*!+«[ 
SLOCe 

6.77. 

13.57. 

Modularity 

. = — &x±Bx — 
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Changeability 
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Maintainability 

.49 

Table  4:  End-Product  Quality  Metrics  Definitions 

Modularity.  This  value  characterises  the  extent  of  damage  expected  for  the  average  SCO.  A value 
of  53  SLOC  implies  that  the  average  SCO  only  affected  the  equivalent  of  one  program  unit.  Since  most 
of  the  trivial  errors  get  caught  in  standalone  test  and  demonstration  activities,  this  value  indicates  the 
average  impact  for  the  non- trivial  errors  which  creep  into  a configuration  baseline.  This  value  suggests 
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that  the  software  design  is  flexible  but  with  so  basis  for  comparison,  this  b partly  conjecture.  An 
additional  metric  which  would  be  useful  in  assessing  modularity  would  be  the  number  of  flies  affected 
per  change.  This  would  provide  insight  into  the  locality  of  change  as  well  as  the  exteat.  Thb  infixmation 
was  not  available  in  the  CCPDS-R  historical  data,  but  it  is  being  collected  in  future  data. 

Changeability.  The  average  effort  per  SCO  provides  a mechanism  tor  comparing  the  complexities  of 
change.  As  a project  average,  16  hours  suggests  that  change  is  fairly  simple.  When  change  is  timpfc-, 
a project  is  likely  to  increase  the  amount  of  change  thereby  increasing  the  inherent  qualify. 

Rework  Improvement.  Figure  5 identifies  how  the  changeability  (Qc)  evolved  over  the  project 
schedule  to  date.  While  conventional  experience  is  that  changes  get  more  expensive  with  tune,  CCPDS- 
R demonstrates  that  the  cost  per  change  improves  with  time.  Thu  is  consistent  with  the  goals  of  «a 
evolutionary  development  approach  [12]  and  the  promises  of  a good  layered  architecture  [13]  vkere  the 
early  investment  in  the  foundation  components  and  high  risk  components  pays  off  in  the  remainder  of 
the  life  cycle  with  increased  ease  of  change.  The  tread  of  this  metric  would  indicate  that  the  CCPDS-R 
software  design  has  succeeded  in  providing  an  integrable  component  set  with  effective  control  of  breakage. 
Had  the  trend  of  thb  metric  showed  growth  in  effort  per  SCO  without  stabilisation,  management  may  be 
concerned  about  the  design  quality  and  downstream  risks  in  reworking  an  increasingly  hard  to  change 
product.  Note  that  Qc  metric?  do  not  include  the  cost  of  downstream  re- verification  of  higher  level 
requirements  since  the  broad  range  of  these  activities  would  corrupt  the  intent  of  the  metric.  Qc  bus 
been  purposely  defined  to  reflect  the  technical  risk  of  change,  not  the  cost  of  revmfication  in  a larger 
context  or  the  management  risk.  For  example,  a late  change  of  minor  complexity  could  result  in  irgTrvuw.tn 
test  by  inspection  or  a complete  reverifi cation  of  numerous  performance  threads.  Thb  mage  of  effort 
varies  with  the  context  of  the  change,  the  customer /contractor  paranob  and  a variety  of  other  iasm  i 
which  are  not  reflective  of  the  ease  of  change.  The  technical  cost  of  change  b not  closed  out  however, 
until  thb  reverification  b complete  since  it  may  result  in  reconsideration. 


Figure  5:  Rework  Improvement:  Changeability  Evolution 

Maintainability.  The  ratio  of  Rg  to  Rs  characterises  the  cost  of  reworking  CCPDS-R  components 
compared  to  developing  them  from  scratch.  Thb  value  along  with  the  change  traffic  experienced  during 
the  last  phase  of  the  life  cycle  could  be  used  to  predict  the  maintenance  productivity  expected  from 
the  current  development  productivity  being  experienced.  The  overall  change  traffic  during  development 
should  not  be  used  to  predict  operational  maintenance  since  it  b overly  biased  by  immature  prodmet 
changes.  The  FQT  phase  change  traffic  (likely  a lower  value  than  the  complete  development  lifecycle 
traffic),  b a more  accurate  measure.  A value  of  .49  seems  like  a good  maintainability  rating,  but  further 
project  data  would  permit  a better  basis  for  assessment. 

Thb  value  requires  some  caveats  in  its  usage.  First,  thb  maintenance  productivity  was  derived 
from  small  scale  maintenance  actions  (fixes  and  enhancements)  as  opposed  to  large  scale  upgrades 
where  system  engineering  and  broad  redesign  may  be  necessitated.  Secondly,  the  data  b derived  (com 
the  development  lifecycle,  therefore,  it  should  be  treated  as  more  of  an  upper  bound  in  planning  the 
expectations  during  the  maintenance  phase  of  a product  where  the  existence  of  defects  should  be  ftess 
than  that  experienced  during  development.  The  personnel  performing  the  maintenance  actions  kowevs, 
were  knowledgeable  developers  which  may  bias  the  maintainabiEty  compared  to  the  expertbe  of  the 
maintenance  team.  The  message  here,  b that  thb  data,  like  any  productivity  data,  must  be  used 
carefully  by  people  cognisant  with  its  derivation  to  ensure  proper  usage. 
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Functional  CS Cl  Analysis.  A complete  lower  level  analysis -was  performed  to  analyte  the  noons 
contributions  to  the  nines  in  Table  4 by  the  individual  CSCIs.  While  the  evaluation  of  this  lower  level 
data  will  not  be  discussed  here  in  detail,  they  did  uncover  some  interesting  phenomena  which  have  since 
been  incorporated  into  the  plans  of  future  subsystems.  There  were  significant  differences  in  the  various 
CSC!  level  values  which  provided  insight  into  various  levels  of  quality  and  the  need  for  perturbations  to 
future  plans.  The  Qjr  varied  from  .12  to  .85  across  6 CSCIs.  For  example,  relatively  low  values  were 
observed  for  algorithm  (.12)  and  display  (.27)  software  where  ease  of  change  was  a clear  design  goaL 
Higher  values  were  observed  for  the  external  communication*  software  (.51)  and  system  services  software 
(.85)  where  changes  in  an  external  message  set  for  example,  could  result  in  broader  system  impacts.  The 
range  of  values  dearly  identifies  the  relative  difference  in  risk  associated  with  changing  various  aspects 
of  the  design.  The  absolute  risk  associated  with  these  changes  is  difficult  to  assess  without  further  data 
from  other  similar  projects- 

Global  Summary.  In  general,  the  CCPDS-R  program  appears  to  be  converging  towards  a very 
high  quality  prod  net  with  high  probability.  This  assessment  is  implied  from  the  viable  stability  in  the 
quality  metrics.  The  fact  that  these  metrics  are  stable  generally  implies  that  the  remaining  efforts  are 
predictable.  If  the  predictions  do  not  extrapolate  to  better  than  required  performance,  action  can  be 
taken.  The  key  to  optimising  the  value  of  these  metrics  is  to  achieve  stabilisation  as  early  as  possible 
so  that  if  predicted  performance  does  not  match  expectations,  management  can  instigate  improvement 
actions  as  early  in  the  life  cyde  as  possible.  Some  characteristics  of  CCPDS-R  which  are  important  to 
keep  in  mind  when  interpreting  the  above  metrics  indude: 

1.  Many  changes  incurred  by  the  project  were  really  type  3 (true  requirements  change).  However, 
since  most  of  these  were  small  it  was  easier  to  incorporate  them  rather  than  go  through  the  formal 
ECP  process.  In  retrospect,  the  sum  of  all  these  little  changes  was  quite  substantial. 

2.  These  metrics  are  derived  from  the  development  phase,  comparison  with  other  project's  mainte- 
nance phase  metrics  is  misleading.  The  metrics  available  in  the  final  3 months  prior  to  delivery 
(as  opposed  to  the  lifecyde  averages  presented  here)  however,  should  be  fairly  comparable. 

Operational  Concept.  The  concept  of  operations  for  the  software  quality  metrics  program  is  to 
provide  insight  for  the  purposes  of  managing  product  devdopmest  with  minimum  interference  to  the 
development  team.  This  will  be  accomplished  by  integrating  the  standards  for  metrics  collection  into 
the  tools  and  QCB  procedures.  The  responsibilities  of  this  initiative  are  allocated  as  follows: 

Software  Developers:  Follow  the  core  Ada  Design/Devdopment  Standards 

Software  Development  Managers:  Follow  the  evolutionary  process  modd,  adhere  to  core  software 

quality  metrics  policy,  coordinate  with  project  systems  effectiveness  any  project  unique  policies, 
interpret  systems  effectiveness  SQM  analysis  and  be  accountable  for  issues  and  resolutions. 

Corporate  Systems  EfFectireness:  Define  the  SQM  policy/ tools/ proced ores,  evaluate  project  im- 

plementations, improve  the  poliaes /too is /procedures  and  ensure  consistent  usage  across  different 
projects.  This  is  the  same  function  proposed  by  [8]  as  the  standards  group. 

Project  Software  Engineering:  Flowdown  the  SQM  policy/ tools /procedures  into  a project  im- 

plementation, implement  project  QCB,  SQM  collection,  SQM  analysis,  SQM  reporting,  evaluate 
project  implementations,  and  propose  candidate  improvements  to  the  poliaes/ too  Is /procedures. 
Note  that  we  are  putting  this  function  in  the  hands  of  knowledgeable  project  personnel  (as  op- 
posed to  conventional  independent  QA  personnel)  since  the  administrators  of  these  metrics  should 
be  motivated  for  effective  use  through  ownership  in  both  the  process  and  the  products. 

We  would  foresee  SQM  metrics  reporting  on  a monthly  or  quarterly  basis  depending  on  project  phase, 
sue,  risks,  etc.  Furthermore,  the  entire  SQM  initiative  should  be  relatively  dynamic  during  its  infancy 
as  real  project  applications  determine  what  is  most  useful  and  feedback  is  incorporated. 
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SUMMARY 


By  itself,  CCPDS-R  is  pexhaps  a bad  example  for  testing  these  metrics.  In  general,  the  project  bn 
performed  aa  planned  and  has  a high  probability  of  delivering  a quality  product.  It  would  be  usrfxl 
to  examine  a less  successful  project  to  illustrate  the  tendencies  which  every  project  manager  she  old  » 
looking  for  as  indicators  of  trouble  ahead. 

Furthermore,  none  of  these  metrics  by  themselves,  provides  enough  data  to  make  an  assessment  of  a 
project’s  quality.  They  must  be  examined  as  a group  in  conjunction  with  other  conventional  nesioa 
to  arrive  at  an  accurate  assessment.  They  also  do  not  represent  the  only  set  of  useful  metrics  poss&e 
from  the  collected  statistic  on  SCOs  and  rework.  There  are  many  other  ways  to  examine  this  data  and 
present  it  for  trend  analysis.  With  further  automation,  these  other  views  would  be  simple  to  producr. 

Although  not  fully  implemented  on  a large  project  to  date,  subsets  of  the  metrics  presented  heron 
have  proven  useful  in  the  long  term  planning  and  development  process  improvement  on  CCPDS-A 
TRW  b currently  in  the  process  of  expanding  these  concepts  into  a uniform  practice  across  its  An 
software  development  projects  supported  by  automated  tools.  With  the  broad  acceptance  of  Ada  aad 
evolutionary  development  techniques,  this  approach  has  the  potential  of  providing  a uniform  tr-rhnirpr 
for  quality  metrics  collection,  reporting  and  history.  This  data  is  paramount  to  the  implemensmtioa  d 
a consistent  TQM  approach  to  software  development  for  enhanced  software  product  quality  amd  mas 
efficient  software  production.  The  following  activities  still  need  to  be  performed  to  provide  a connpkae 
initiative: 

1.  Enhance  the  standard  SCO  form  with  definitions,  standards  and  procedures  for  usage. 

2.  Develop  a portable  SLOC  Counting  Tool  (the  current  CCPD5-R  Metrics  Tool  would  satisfy  the* 
with  minor  modifications). 

2.  Identify  Ada  standards  (which  would  be  mandatory  across  all  Ada  projects)  necessary  to  gumxantcs 
consistent  metrics  collection  across  projects  and  within  projects.  This  primarily  involves  standards 
for  program  unit  headers  and  program  layout  which  are  not  controversial. 

4.  Develop  a a SCO  data  base  management  system  with  supporting  tools  for  automated  ccilecticu, 
analysis  and  reporting  in  the  formats  defined  above  and  other,  as  yet  undiscovered,  useful  fibrmaa. 

5.  Define  QCB  procedures,  guidelines  for  metrics  analysis  aad  candidate  reporting  formats. 

6.  Incorporate  this  initiative  into  corporate  policy. 

As  a conclusion,  we  should  evaluate  the  approach  presented  herein  with  our  original  goals: 

1.  Simplicity.  The  number  of  statistics  to  be  maintained  in  an  SCO  database  to  implement  ties 
approach  is  5 (type,  estimate  of  damage  in  hours  and  SLOC,  actual  hoars  and  actaal  SAOC  *» 
resolve)  along  with  the  other  required  parameters  of  an  SCO.  Furthermore,  metrics  for  SLOCc 
aad  SLOCt  need  to  be  accurately  maintained.  If  automated  in  an  online  DBMS,  the  remaining 
metrics  could  be  computed  and  plotted  from  various  perspectives  (e.g,  by  build,  by  CSC I)  rc 
a straightforward  manner.  Depending  on  the  extent  of  discipline  already  inherent  im  a project's 
CCB  and  development  metrics,  the  above  effort  could  be  viewed  as  very  simple  (as  in  the  case  t* 
CCPDS-R)  to  complex  (undisciplined,  management  by  conjecture  projects). 

2.  Ease  of  Use-  The  metrics  described  herein  were  easy  to  use  by  CCPDS-R  project  personms 
and  managers  familiar  with  the  project  context.  Furthermore,  they  provide  an  objective  basis  far 
discussing  current  trends  and  future  plans  with  outside  authorities  and  customers.  Most  trends 
are  obvious  and  easily  explained.  Some  trends  require  further  analysis  to  understand  the  uwdexiy- 
Lug  subtleties.  End-product  metrics  provide  simple  to  understand  indicators  of  different  software 
quality  aspects  for  the  purposes  of  comparison  and  future  planning  as  well  as  assessment  of  procew 
improvement. 
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3.  Probability  of  Misuse  There  are  enough  perspectives  that  provide  somewhat  redundant  views 
so  that  misuse  should  be  minimised.  Without  further  experience,  however,  it  is  not  clear  that 
contractor  and  customer  will  always  interpret  them  correctly.  Although  correct  interpretation 
could  never  be  guaranteed,  it  would  be  beneficial  to  obtain  more  experience  to  evaluate  where 
misinterpretation  is  most  likely. 
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Large  Software  Project  Issues 
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Primary  Contributors  To  Software  Diseconomy  of  Scale  (Boehm): 

• Rework 

• Interpersonal  Communications 

• Requirements  Volatility 

Ada  COCOMO  (Boehm/Royce)  Speculates  That  Economy  of  Scale  Is  Possible 

• Use  an  Evolutionary  Development  Approach 

• Use  Ada  as  a Lifecycle  Language 


! 


Optimize  Rework 

• Minimize  Ineffective  Rework 

- Do  Hard  Parts  First 

- Compartmentalized  Breakage 

• Maximize  Rework  Efficiency 

- Fix  it  Early 

- Design  for  Change 
Minimize  Interpersonal  Communications 

• Small  Expert  Design  Team 

• Layered  Architecture 

• Self  Documenting  Lifecycle  Language  (Ada) 
Optimize  Requirements  Volatility 

• Stablize  Necessary  Primitives  Early 

• Change  Requirements  As  Product  Matures 

• Design  for  Change 
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Most  Important  Software  Quality  is  that  it  is  I SOFT 

• Modularity: 

- Breakage  Extent  When  Changed 

• Changeability: 

- Complexity  of  Effort  to  Analyze/Implement  Change 

• Maintainability: 

- Productivity  of  Change 


Assumptions: 

• Evolutionary  Process  Model 

• Consistent  SLOC  Counting 

• Configuration  Control  Board  For  Change  Assessment 
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Quality:  Degree  of  compliance  with  customer  expectations  of  function , performance , 
cost  and  schedule 

Quality  Metrics  Derived  from  Measurement  of  Rework 

• Type  1 Rework:  Fix  Bad  Quality  Instance 

• Type  2 Rework:  Improve  Quality 

• Type  3 Rework:  Requirements  Change 
All  Rework  Corresponds  to  Quality  Increase 
Quality  Metrics  Approach 

• Collect  Statistics  on  Rework  over  Project  Lifecycle 

• Quantify  Meaningful  Metrics 

• Plot  Processed  Statistics  Over  Time 
f 

* - Quality  Progress  Trends 
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CCPDS-R  CDR  STATUS 


7PSS 


• Evolutionary  Approach  Permits  Tangible  Insight  into  End-product 


SDR  SSR  PDR  CDR 

Traditional  — ^ ^ ^ ^ 

SDR  SSR  PDR  CDR 

CCPDS-R  V ^ ^ 


FCA/PCA 

st. 

FCA/PCA 
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Traditional  Approach 


Software  Design  Complete 

Code  development  «10% 

Software  Integration  Negligible 

Formal  Test  0% 

Performance  Assessment  Modeling 


CCPDS-R  Approach 
Complete 
94% 

74% 

12% 

80%  of  Operational 
S/W  Demonstrated 


This  New  Approach  Enables  Early  Software  Quality  Assessment 


CCPDS-R  SCO  Experience 
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CCPDS-R  Quality  Metrics  Actuals 
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Metric 

Definition 

CCPDS-R  Value 

Rework:  R e =%  of  Effort 

6.7% 

Rework  Proportions 

Rework:  R$  =%  of  Product 

13.5% 

Modularity 

Q mod  = Average  Breakage  per  Change 

ir  o SLOC 
SCO 

Changeability 

Qc  = Average  Effort  per  Change 

1 C 7 Hra 

SCO 

Maintainability 

Qj\/  =Normalized  Rework  Productivity 

.49 

10 
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Important  Needs  For  Successful  Use: 

• Consistency  of  Application 

• Automated  Tools 

• Management  And  Practitioner  Acceptance 
Advantages: 

• Quantitative  Data  For  Decision  Making 

• Quantitative  Data  For  Subjective  Requirements  Compliance 

- Maintainability,  Modularity,  Adaptability,  etc. 

• Historical  Data  for  Better  Future  Planning 
Disadvantages: 

• No  Existing  Multi-project  Historical  Database 

- Only  CCPDS-R  Data  Exists  Now 

• No  Existing  Project  Independent  Toolsuite 
| Quality  Metrics  Can  Be  Used  Effectively! 
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JPL's  Real-Tine  Weather  Processor  Project  (RWP) 

Metrics  and  Observations  at  System  Cospletion 

Build  3 


by  Robert  E.  Loesh  (RWP  Project  Office) 
Robert  A.  Conover  (RWP  Project  Manager) 
Shan  Malhotra  (System  Analysis  Section) 


Jet  Propulsion  Laboratory 
4800  Oak  Grove  Drive 
Pasadena,  California  91109 


This  work  was  performed  by  the  Jet 
Propulsion  Laboratory,  California  Institute 
of  Technology,  for  the  FAA  through- -en 
agreement  with  NASA.  (HAS  7-918). 

This  presentation  is  an  update  'to.Jthe  November  1988  GSFC 
First  Ada  Symposium  presentation  which  provided  preliminary  data 
reflecting  the  RWP  Project  at  the  Build-3  Preliminary  Design 
Review.  This  presentation  is  based  upon  the  completion  of  the 
RWP  Build-3  development  and  the  associated  Metrics  Report  draft. 
The  RWP  Build-3  Metrics  Report  will  be  completed  in  March  1991 
and  will  be  submitted  for  public  release  which  may  take  3-5 
months.  Because  this  presentation  is  based  on  the  draft  Metrics 
Report,  prior  to  complete  validation  of  all  the  data,-  minor 
corrections  may  result  after  the  Final  Metrics  Report  is 
completed. 

The  development  of  the  RWP  System  is  sponsored  by  the 
Federal  Aviation  Administration  (FAA) . The  RWP  is  one  of  several 
weather  information  programs  the  FAA  has  identified  in  the  FAA’s 
National  Airspace  System  (NAS)  Plan,  which  describes  all  programs 
planned  for  modernizing  and  improving  air  traffic  control  and 
airway  facilities  services  by  the  year  2000. 

An  integral  part  of  the  overall  upgraded  NAS,  the  objective 
of  the  RWP  is  to  improve  the  quality  of  weather  information  and 
the  timeliness  of  its  dissemination  to  system  users.  To 
accomplish  this,  an  RWP  will  be  installed  in  each  of  the  Center 
Weather  Service  Units  (CWSUs) , located  in  21  of  the  23  Air  Route 
Traffic  Control  Centers  (ARTCCs) . The  RWP  System  is  a Prototype 
System.  It  is  planned  that  the  software  will  be  GFE  and  that 
production  hardware  will  be  acquired  via  industry  competitive 
procurement . 

The  ARTCC  is  a facility  established  to  provide  air  traffic 
control  service  to  aircraft  operating  on  Instrument  Flight  Rules 
(IFR)  flight  plans  within  controlled  airspace,  principally  during 
the  en  route  phase  of  flight.  Beginning  in  1993,  and  continuing 
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to  1998,  the  ARTCCs  will  be  reconfigured  to  include  both  en  route 
and  approach  control  functions.  The  reconfigured  facilities  will 
be  called  Area  Control  Facilities  (ACFs) . 

rwp  will  process  up  to  27  Next  Generation  Radar  (NEXRAD) 
weather  data  simultaneously  in  real-time  and  create  mosaic 
displays.  The  processed  NEXRAD  data  is  disseminated  directly  to 
meteorologists  and  FAA  aircraft  controllers.  This  information  is 
updated  every  three  to  five  minutes. 

The  RWP  project  was  started  in  November  of  1987  which 
resulted  from  the  descoping  of  the  Central  Weather  Processor 
Project  (CWP)  . At  the  time  of  the  descoping  the  CWP  was  in 
detailed  design  and  planned  for  the  "C"  programming  language 
development  environment.  RWP  is  following  DOD-STD-2167A  and  the 
software  will  be  coded  in  the  DOD  standard  ADA  programming 
language.  RWP  is  composed  of  3 incremental  development  builds 
(Build-1,  Build-2  and  Build-3) . Build-3  contains  all  of  the 
capabilities  specified  in  the  RWP  System  Specification.  There 
was  one  Preliminary  Design  Review  (PDR)  for  the  entire  system  and 
an  individual  Critical  Design  Review  (CDR)  for  each  Build.  The 
Coding  and  Unit  Testing  (CUT)  was  completed  in  February  1990. 
System  Testing  was  completed  in  June  1990.  FAA  Prototype  (FAA 
Users)  Test  & Evaluation  (PT&E)  was  completed  in  July  1990. 
Following  PT&E  several  changes  were  made  to  improve  the  Man- 
Machine  Interface  and  System  Reliability.  This  was  followed  by 
the  FAA  Formal  System  Acceptance  Test  (FSAT)  completed  in  October 
1990.  Final  as-built  documentation  and  the  FSAT  Test  Report  are 
scheduled  for  mid  January  1991. 

The  system  is  composed  of  one  CSCI  developed  by  JPL  that  has 
704  Computer  Software  Units  (CSUs)  and  is  composed  of  97,687  Ada 
Statements,  number  of  semicolon  "?H  delimiters,  (or  213,961 
Source  Lines  of  Code  ( (SL0C) ) , carriage  return  delimiters  less 
comments  and  blanks,  but  including  specifications  and  data,  type, 
declarations).  In  addition  it  has  4,330  of  "C"  SLOC. 

In  addition  to  the  software  developed  by  JPL  there  are  two 
areas  where  Commercial-Off-The-Shelf  (COTS)  software  is  used: 

- Communications  Protocols 

- Man-Machine  Interface  (DECWindows  and  DECs  Forms 
Management  System) 

Following  are  some  of  the  metrics  and  observations. 
Requirements  Metrics  and  Observations 

The  RWP  System  Specification  contains  a total  of  223 
requirements  within  70  pages  of  the  document.  On  the  average 
there  are  about  3 requirements  per  page.  This  does  not  include 
the  specification  of  the  external  RWP  System-to-System 
interfaces.  These  are  contained  in  a series  of  Interface  Control 
Documents  (ICDs) . The  System  Specification  was  approved  May 
1988.  Any  System  Specification  questions,  clarifications,  or 
additions  were  reviewed  and  negotiated  by  the  RWP  System  Design 
Teat  (SDT)  which  was  composed  of  key  technical  lead  staff  from 
each  area  (Project  Office,  System  Engineering,  Software 
Development,  Hardware  Development,  Test  and  Operations,  Product 
Assurance  and  Configuration  Management) . Results  of  these 
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meetings  vere  processed  using  Project  Configuration  Management 
procedures  and  documented  in  the  SDT  minutes  as  "Open  Issues." 
During  the  development  222  Open  Issues  vere  discussed  by  the  SOT 
and  approximately  40%  vere  external  system  interface  issues 
relating  to  ICDs.  The  222  Open  Issues  resulted  in  52  Engineering 
Change  Requests  (ECRs)  and  34  Request  for  Deviation/Waivers, 

(RDWs)  to  the  System  Specification.  RDWs  vere  used  as  the 
interim  method  for  correcting  vording  in  the  RWP  System 
Specification. 

Approximately  2/3  of  the  Open  Issues  vere  generated  by  the 
Test  and  Operation  Organization  (TOO)  resulting  primarily  from 
the  preparation  of  the  System  Integration  and  Test  Descriptions 
and  Procedures.  The  other  1/3  vere  generated  by  the  Softvare 
Development  Organization  (SDO) . 

The  52  ECRs  and  34  RDWs  caused  a significant  revork  inpacrt 
late  in  the  development  life  cycle. 

The  conclusion  ve  have  dravn  is  that  if  the  System 
Integration  Test  Descriptions  and  Procedures  had  been  prepared 
earlier  in  the  life  cycle,  most  of  the  Open  Issues  vould  have 
been  initiated  and  resolved  before  much  of  the  development  vas 
completed  or  even  started  and  the  amount  of  revork  vould  have 
been  minimized  (significantly  less) . 

SPFR  Metrics  and  Observations 

1,266  Softvare  Problem  Failure  Reports  (SPFRs)  vere 
generated  vhich  vere  based  upon  requirements  (Priority  1,2,3); 
see  DOD— STD-2 167 A error  classification. 

SPFRs  reflect  all  errors  reported  during  softvare  (CSCI)  cr 
system  related  requirements  resting.  The  only  exception  is  that 
any  errors  found  during  Coding  and  Unit  Testing  and  CSC 
Integration  Testing  still  outstanding  at  the  start  of  CSCI 
Requirements  Testing  vere  turned  into  SPFRs  at  that  time. 

Most  notable  is  the  small  number  of  SPFRs  (18%)  that  existed 
at  the  start  of  CSCI  Requirements  Testing  and  the  large  % (40%) 
of  SPFRs  found  during  System  Integration  and  Testing  (SIT) . 
Because  of  schedule  pressures  the  CSCI  Requirements  Testing  (9% 
of  errors)  vas  deleted  for  the  third  incremental  Build.  This 
explains  the  small  number  of  errors  found  (9%)  during  CSCI 
Requirements  Testing  and  likely  contributed  to  the  large  number 
found  during  system  level  testing  (SIT,  FSAT-1,  FSAT-3  = 51%). 

While  there  are  no  specific  comparisons  or  conclusions  ve 
are  prepared  to  make  on  the  SPFR  code  grovth.  It  may  serve  as  an 
important  point  of  reference  to  note  the  code  grovth  per  SPFR  for 
embedded  systems  vhere  the  memory  utilization  and  margin  is 
critical.  Our  experience  over  six  interim  error  correction 
Builds  is  that  ve  had  approximately  8.4  Ada  statements  of 
increase  for  each  SPFR  corrected.  This  does  not  provide  any 
detail  of  number  of  specific  amounts  of  code  deleted,  changed  anri 
added;  only  the  net  result. 

During  SIT  there  vere  5 errors  reported  per  1000  Ada 
statements.  A more  useful  number  is  the  error  density  per  SLOC 
vhich  allovs  for  comparison  to  numerous  density  reports  on 
previous  other  developments.  It  is  typical  in  this  phase  to  see 
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error  density  rates  in  the  3 to  10  errors  per  thousand  range  with 
the  median  falling  around  5-6  errors  per  thousand,  comparing  the 
RWP  Project  error  densities  with  other  Fortran,  "C"  type 
developments  it  is  our  observation  that  there  were  fewer 
(approximately  50%)  errors  during  the  RWP,  SIT  that  some  previous 
projects.  Some  of  this  probably  is  due  to  the  use  of  Ada. 
However,  other  factors  also  contributed  such  as  quality  of  staff, 
low  attrition  of  staff,  etc. 

Based  upon  the  number  of  work  years  of  effort  for  CSCI 
Requirements  Testing  versus  the  number  of  work  years  for  SIT,  SIT 
was  51%  more  productive  in  error  generation.  This  is  probably 
exaggerated  somewhat  due  to  the  deletion  of  the  Build-3  CSCI 
Requirements  Testing. 

The  metrics  of  the  number  of  work  days  to  fix  an  SPFR  is 
between  1 . 9 and  2 . 3 work  days . The  average  is  2 . 1 work  days  per 
SPFR  correction.  This  includes  any  design,  coding,  unit  testing, 
CSU  and  CSC  integration  and  delivery  of  the  code  to  the  Project 
Software  Library. 

The  % of  SPFRs  fixed  that  were  incorrect  or  created  other 
problems  attributable  to  the  fixed  code  was  3%  or  less.  This 
allowed  us  to  use  the  4-6  week  period  prior  to  release  of  Builds 
for  various  system  level  tests  (SIT,  FSAT,  PTSE)  to  continue  to 
be  used  SPFR  correction  rather  verification  of  the  SPFRs  fixed. 
With  2-3  months  centers  for  Build  deliveries  and  version  updates, 
it  provided  us  with  1/3  more  time  to  fix  SPFRs  and  a higher 
overall  SPFR  correction  productivity  rate  given  a fixed  period. 
Ada  Portability  Wetric3  and  Observations 

Ada  portability  was  established  as  a Project  high  priority 
design  goal.  The  object  was  to  minimize  the  various  code 
constructs  that  may  need  to  changed  using  the  same  programming 
language  and  software  design  but  different  hardware.  The 
following-  specific  design  decisions  were  made  to  meet  the 
portability  goal: 

- Ada  Programming  Language  & Standard 

- Ada  Tasking  to  minimize  Operating  System  function 
uniqueness 

DEC  Windows  (X-windows  subset)  to  minimize  the  Man- 
Machine  Interface  rework 

- Object  Oriented  Design  Methodology  to  localize  external 
interface  dependencies  and  rework 

Other  Engineering  Principles  and  Standards  to  minimize 
rework 

A tool  was  developed  and  used  by  the  Product  Assurance  staff 
to  analyze  the  code  to  identify  each  non -port able  construct  and 
provide  summary  statistics.  Because  of  the  still  less  than 
stable  industry  standards  on  X-windows  the  tool  produced  the 
portability  results  both  with  DECWindows  portable  and  non- 
portable. 

Portability  can  be  measured  any  number  of  ways. 

One  of  the  most  useful  is  to  measure  the  number  of  specific  code 
constructs  that  run  a risk  of  needing  to  be  modified  for 
execution  on  a different  computer  than  that  used  for  the  RWP 
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system  (i.e. , DEC) . This  measure  could  then  be  compared  to  the 
number  of  code  constructs  existing  in  the  developed  RWP  software. 
The  tool  does  provide  the  number  of  non-portable  constructs  (i.e. 
12,267).  However  there  is  no  measurement  of  the  number  of  total 
code  constructs  in  the  RWP  developed  code.  There  i3  a count  of 
the  number  of  Ada  statements  (i.e.  97,687).  There  may  be  1 or 
more  constructs  per  Ada  statement  but  it  is  still  a useful  number 
to  quantify  the  ratio  or  metric  of  % portable.  If  ve  divide  the 
total  Ada  statements  into  the  non-portable  constructs  we  get  the 
answer  approx.  12. 5%.  Therefore,  on  a construct  basis,  the  RWP 
system  is  at  least  87.5%  portable.  This  does  not  include  any 
changes  needed  to  accommodate  word  size  or  reformatting  to 
accommodate  storage  devices  that  are  unique.  It  should  be 
cautioned  that  using  the  difficulty  classification  to  compute 
work  hours  to  port  the  system  should  not  be  done.  Since  many  of 
the  porting  changes  for  one  type  of  construct  is  mechanically 
repeatable  and  represents  a single  instance,  worst  case,  the 
estimation  of  porting  effort  needs  to  consider  repeatability.  In 
addition,  not  all  of  the  constructs  identified  as  a porting  risk 
may  need  to  be  ported. 

However,  the  metrics  and  analysis  should  set  an  industry 
reference  point  for  specifying  design  requirements  for 
portability. 
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JPL  RUP  METRICS  AND  OBSERVATIONS  RWP 

o DATA  IS  PRELIMINARY  N'v  ! 

o MINOR  CORRECTIONS  MAY  RESULT  AFTER  VALIDATION  PROCESS 

o METRICS  REPORT  TO  BE  COMPLETE  IN  MARCH  1991 

- MILL  START  PROCEDURE  FOR  PUBLIC  RELEASE  OF  METRICS  REPORT 

- RELEASE  MAY  TAKE  3 - 5 HONTHS 
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RWP  METRICS  AND  OBSERVATIONS 

JPL 

AGENDA 

o WHAT  IS  THE  RWP  SYSTEM? 
o REQUIREMENTS  ISSUES 
o TESTING  EFFECTIVENESS 
o ERROR  DENSITY  AND  DISCOVERY  RATE 
o ADA  ERROR  CORRECTION  RATES 
o PORTABILITY  ISSUES 
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RWP  METRICS  AND  OBSERVATIONS 


WHAT  IS  THE  RWP  SYSTEM? 

o SPONSOR:  FEDERAL  AVIATION  ADMINISTRATION  (FAA) 

o PROTOTYPE  DEVELOPMENT;  EVENTUALLY  PART  OF  NATIONAL  AIRSPACE  SYSTEM 
UPGRADE 

o RWP  MILL  PROCESS  WEATHER  DATA  IN  REAL-TIME  BY  CREATING  A MOSAIC  DISPLAY 
OF  UP  TO  27  RADARS  SIMULTANEOUSLY.  THE  DATA  WHICH  IS  DISSEMINATED 
DIRECTLY  TO  THE  FAA  AIRCRAFT  CONTROLLERS  AND  METEOROLOGISTS  IS  UPDATED 
EVERY  THREE  TO  FIVE  MINUTES 

o PROJECT  MILESTONES: 

- PROJECT  START  - NOVEMBER  1987 

- CODING  COMPLETE  - FEBRUARY  1990 

- SYSTEM  TESTING  COMPLETE  ••  JUNE  1990 

- FAA  OPERATIONAL  TEST  AND  EVALUATION  - JULY  1990 

- FAA  FORMAL  SYSTEM,  ACCEPTANCE  TEST  - OCTOBER  1990 

o 1 RWP  SYSTEM  AT  21  OF  23  AREA  CONTROL  FACILITIES;  7 EXTERNAL  INTERFACES 
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WHAT  IS  THE  RUP  SYSTEM?  (Cont'd) 

S/M  INTENSIVE;  H/U  OFF-THE-SHELF 

- 1 COMPUTER  S/U  CONFIGURATION  ITEH 

• DEVELOPED  BY  JPL:  97,687  (ADA  STATEMENTS) 

213,961  (CARRIAGE  RETURNS  (COMMENTS 
AND  BLANKS)) 

4,330  (C  SLOC) 

•COMMERCIAL  OFF-THE-SHELF:  280,238  (C  SLOC) 

— COMMUNICATIONS  PROTOCOLS 
- DEC-UINDOUS 

— DEC  FORMS  MANAGEMENT  SYSTEH 


ADA,  DOD-STD-2167,  REVISION  A:  TAILORED 


JPL 


RUP  METRICS  AND  OBSERVATIONS 


WHAT  IS  THE  RWP  SYSTEM?  (Cont'd) 

- DISTRIBUTED  H/U  ARCHITECTURE 

• 10  MICRO  VAX  IIS,  3 MICRO  VAX  3600S,  1 MICRO  VAX  3200 

• VAXELH  AND  VAX/VMS  OPERATING  SYSTEMS,  DECNET,  ISO  PROTOCOLS 

- THE  RUP  SYSTEMS  ARE  SCHEDULED  TO  BE  INSTALLED  IN  THE  FAA  CONTROL 
CENTERS  BY  1994  BY  A FAA  SYSTEM  CONTRACTOR  WHO  IS  SCHEDULED  FOR 
SELECTION  IN  1992.  JPL  IS  PLANNING  TO  PROVIDE  TECHNICAL  SUPPORT  TO 
THE  FAA  THROUGH  1994  FOR  THE  INSTALLATION  OF  THE  FIRST  THREE  OF 

23  SITES. 
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RWP  METRICS  AND  OBSERVATIONS 

REQUIREMENTS  ISSUES 

0 SYSTEM  SPECIFICATION  (WRITTEN  BY  JPL  AND  FAA) 

- 205  FUNCTION  AND  PERFORMANCE  REQUIREMENTS 

- + 18  PERFORMANCE  (COUNTED  AS  1) 

*223 

o SYSTEM  SPECIFICATION  FUNCTIONAL  AND  PERFORMANCE  REQUIREMENTS  PAGES  = 70 
APPROXIMATELY  0.3  PAGES/REQUIREMENT 

o PROJECT  SYSTEM  DESIGN  TEAM  (SDT)  ADDRESSED  REQUIREMENTS  ISSUES  AT  WEEKLY 
HEETINGS 


0 ISSUE  RESOLUTIONS  WERE  DOCUMENTED  IN  DESIGN  TEAM  MINUTES  AND  PROCESSED 
VIA  CONFIGURATION  MANAGEMENT: 

- ENGINEERING  CHANGE  REQUESTS  (ECRs)  TO  SRS,  ICDs  AND  SYSTEH/SEGMEHT 

If  DESIGN  DOCUMENT 

- REQUEST  FOR  DEVIATION/WAIVER  (ROW)  TO  SYSTEH  SPECIFICATION 
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REQUIREMENTS  ISSUES  (Cont'd) 

222  OPEN  ISSUES  DISCUSSED  AT  DESIGN  TEAM 

- APPROXIMATELY  40*  WERE  INTERFACE  (ICO)  ISSUES 

RESOLUTION  RESULTED  IN: 

- 52  ECRs  TO  ICDs,  SRS  AND  SSDD 

- 34  ROMs  TO  SYSTEM  SPECIFICATION 

APPROXIMATELY  TUO-THIRDS  OF  OPEN  ISSUES  CAME  FROM  SIT  STAFF  DOING  STT 
DESCRIPTIONS  AND  PROCEDURES  (SITD/P) 

MOST  ECRs  AND  RDUs  RESULTED  IN  SOFTWARE,  DOCUMENT  AND  TEST  PROCEDURE  AND 
DATA  REWORK 

WRITE  INITIAL  VERSION  OF  SYSTEM  TEST  PROCEDURES 
AS_EARLY_AS_EOSSIBLE 
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SPFR  COUNT 

(by  phase) 

o TOTAL  APPROXIMATELY  2, 100  SPFRs  TO  DATE 

1,452  RUP  CSCI  RELATED 
1,266  PRIORITY  1 - 3 


PHASE 

SPFR  COUNT 

PERCENT 

SYSTEM  INTEGRATION  TESTING 

508 

401 

CSCI  REQUIREMENTS  TESTING 

117 

9* 

CSC  INTEGRATION  TESTING 

231 

181 

CODE  AND  UNIT 

6 

0* 

FSAT-1 

98 

8V 

FSAT-2 

35 

31 

BUILD 

172 

14% 

OTHER 

98 

8% 

17255 

irat 
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RWP  METRICS  AND  OBSERVATIONS 


RUP  SPFR  DISTRIBUTION  (BY  PHASE) 


cur  BUILD  I 


cur  BUILD  2 


CUT  BUILD  3 


CSC!  REQ  TEST 


5Y5TEI1  TEST 


CSC  DD  I 


CSC  QD  2 


CSC  BD  3 
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RUP 

METRICS  AND  OBSERVATIONS 
SPFR  DENSITY 

BWP 

Mr/' 

BUILD 

CARRIAGE 

RETURNS 

SEMI- 

COLONS 

.FROM  PREVIOUS  BUIU>  » FROM  PREVIOUS  E 
(CARRIAGE  RETURNS)  (SEMI-COLONS. 

IUILD  SPFRS 
) FIXED 

3.8 

401,544 

94,886 

mm  m w 

— 

128 

3.9 

405,537 

95,613 

3,993 

727 

55 

3.10 

408,895 

96,214 

3,358 

601 

80 

3.11 

409,362 

96,308 

467 

89 

18 

3.12 

414,546 

97,687 

5,164 

1,379 

78 

3.13 

420,804 

99,288 

6,258 

1,601 

135 

3.14 

418,433 

98,748 

-2,371* 

-504* 

95 

TOTAL  6 BUILDS 

(3.9  - 3 

.14)  16,889 

3,857 

461 

SEMI-COLONS  PER  SPFR 

8.4 

CARRIAGE  RETURNS  PER  SPFR 

36.6 

NASA/JPL 
Pace  IS  of  25 
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RWP  METRICS  AND  OBSERVATIONS 

ERROR  DENSITY  AND  DISCOVERY  RATES 

o CSCI  REQUIREMENTS  TESTING 

- STOPPED  FOR  BUILD-3: SCHEDULE  AND  RETURN  ON  INVESTMENT 

- MOVED  TO  SDO  FOR  BUILD-4 

• PRODUCTIVITY:  LESS  OVERHEAD  TO  ERROR  PROCESSING 

• EMPHASIZE  REQUIREMENTS  RESPONSIBILITY  OF  SDO  STAFF 


o SYSTEM  INTEGRATION  AND  TESTING  (SIT) 

- NINE  MONTHS  TEST  EXECUTION  PERIOD 

- RESET  AFTER  THREE  MONTHS  TO  ACCOMMODATE  LATE  SOFTWARE  DELIVERY 

- 401  OF  ERRORS  FOUND  DURING  SIT 
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RWP  METRICS  AND  OBSERVATIONS 
jpl  RWP 

ERROR  DENSITY  AND  DISCOVERY  RATES  (Cont'd)  ''W 

o FORMAL  ACCEPTANCE  SYSTEH  (FSAT) 

- FAA  TEST  WITNESS 

- lit  OF  SPFRs  FOUND  DURING  FSAT 

- TWO  FSATs 

• FSAT-1:  APPROXIMATELY  THREE  WEEKS:  98  SPFRs  <8%  SPFRs) 

• FSAT-2:  APPROXIMATELY  ONE  WEEK:  35  SPFRs  (3t  SPFRs) 


o 95+t  SYSTEM  FUNCTION  AND  PERFORMANCE  REQUIREMENTS  FULLY  VALIOATEO 


o SIT  HETRICS  AND  OBSERVATIONS 

- APPROXIMATELY  5 ERRORS  PER  1000  ADA  STATEMENTS 

- APPROXIMATELY  2.3  ERRORS  PER  1000  CARRIAGE  RETURNS 


APPROXIMATELY  1/2  LESS  THAN  TYPICAL  FORTRAN 
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RWP  METRICS  AND  OBSERVATIONS 


ERROR  DENSITY  AND  DISCOVERY  RATES  (Cont'd) 


o SIT  VERSUS  CSCI  REQUIREMENT  TESTING  (NOTE-1) 

- CSCI  REQUIREMENTS  TESTING  APPROXIMATELY  19. 5 ERRORS/TEST  WORK  YEAR 

- SYSTEM  INTEGRATION  AND  TESTING  APPROXIMATELY  29.5  ERRORS/TEST  WORK 
YEAR 


SIT  APPROXIMATELY  511  MORE  PRODUCTIVE  THAN  CSCI 
REQUIREMENTS  TESTING 


NOTE-1:  SYSTEM  = 1 CSCI 
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RWP  METRICS  AND  OBSERVATIONS 

ADA  ERROR  CORRECTION  RATES 


NUMBER  OF  SPFRs 
IN  BUILD 

WORK  DAYS 
_PER/SPFR 

108 

1.5 

128 

2.2 

2-3  MEEKS/BUILD 

55 

2.1 

80 

1.9 

8 BUILDS  OVER  6 MONTHS 

98 

2.6 

78 

2.3 

EXPERIENCED  RUP/ADA  STAFF 

135 

2.3 

95 

1.9 

16.9  = 2.1  AVERAGE  WORK  DAYS/SPFR  CORRECTION 

o TYPICAL  WORK  DAYS  PER  SPFR  APPROXIMATELY  1.9  TO  2.3 
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ADA  ERROR  CORRECTION  RATES  (Cont'd) 
o 96+%  OF  CORRECTIONS  HERE  VALID: 

- UE  WERE  ABLE  TO  HAKE  ONE  ADDITIONAL  BUILD  PRIOR  TO  FSATs  TO 

INCREASE  RELIABILITY 
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ADA  PORTABILITY  METRICS  AND  OBSERVATIONS 
0 PROJECT  ESTABLISHED  PORTABILITY  AS  DESIGN  OBJECTIVE  EARLY 
o PERFORMED  ANALYSIS  USING  THREE  TOOLS  AND  LIMITED  HUMAN  ANALYSIS 

- ADA  COMPILER 

- JPL  DEVELOPED  TOOL: 

* SEE  PAPER  BY  BORIS  SHENKER  AND  HERHAN  GUARDA 

AN  AUTOMATED  TOOL  FOR  PORTABILITY  ANALYSIS  OF  ADA  CODE  OF  THE 
REAL-TIME  WEATHER  PROCESSOR  PROJECT 
PRESENTED  AT  NINNOWBROQK  WORKSHOP,  JULY  1990 

- ADA-MAT:  FOR  VALIDATION 


0 
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PORTABILITY  HAS  THREE  LEVELS  OF  RISK: 

EFFORT  TO  CONVERT 
-LOW  0-2  WORK  HOURS 

- MEDIUM  2-8  WORK  HOURS 

- HIGH  OVER  8 WORK  HOURS 
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ADA  PORTABILITY  METRICS 


Njrtfi- 


TOTAL  UNITS 

PORTABLE  UNITS 

NON-PORTABLE  UNITS 

UNITS  WITH  HIGH  RISK  CONSTRUCTS 

UNITS  ONLY  WITH  LOVJ  RISK  CONSTRUCTS 

TOTAL  ADA  STATEMENTS  (;) 

TOTAL  NON-PORTABLE  CONSTRUCTS: 

- HARDWARE 

- OPERATING  SYSTEM** 

- ADA  COMPILER 

- COMMERCIAL  OFF-THE-SHELF  (COTS) 


100% 


97,687 

12,267 

1,192 

2,290 

5,220 

3,565 


100% 


DOES  NOT  INCLUDE  DATA  ISSUES  (E.G.  WORD  SIZE,  STORAGE  ISSUES) 


DOCS  NOT  INCLUDE  PARAMETER  SETTINGS 
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ADA  PORTABILITY  METRICS 

X-MIHDOWS  PORTABLE 


■0=9 

RWP 
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* DIFFERENCE 


TOTAL  NON-PORTABLE  CONSTRUCTS 

11,444 

n 

HARDWARE 

1,192 

0 

OPERATING  SYSTEM** 

2,290 

0 

ADA  COMPILER 

5,220 

0 

COTS 

2,742 
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Abstract 

Early  experiences  building  a software  quality  prediction  model  are 
discussed.  The  overall  research  objective  is  to  establish  a capability  to  project 
a software  system's  quality  from  an  analysis  of  its  design.  The  technical 
approach  is  to  build  multivariate  models  for  estimating  reliability  and 
maintainability.  Data  from  twenty-one  Ada  subsystems  have  been  analyzed 
thus  far  to  test  hypotheses  about  various  design  structures  leading  to  failure- 
prone  or  unmaintainable  systems.  Current  design  variables  highlight  the 
interconnectivity  and  visibility  of  compilation  units.  Other  model  variables 
provide  for  the  effects  of  reusability  and  software  changes.  Reported  results 
are  preliminary  because  additional  project  data  is  being  obtained  and  new 
hypotheses  are  being  developed  and  tested.  Current  multivariate  regression 
models  are  encouraging,  explaining  60-80%  of  the  variation  in  error  density 
of  the  subsystems. 

Introduction 


A typical  shortcoming  of  large-scale  software  development  is  the 
uncertainty  concerning  the  consequences  of  design  decisions  until  much  later 
in  the  development  process.  Greater  capability  is  needed  during  the  design 
activity  to  assess  the  design  itself  for  indications  that,  when  implemented,  the 
resulting  system  will  have  particular  quality  characteristics.  This  paper 
discusses  the  early  experiences  in  a research  project  to  evaluate  the  quality  of 
Ada  designs. 

The  research  objective  is  to  test  the  hypothesis  that  Ada  software 
quality  factors  can  be  predicted  during  design.  The  technical  approach  is  build 
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multivariate  models  to  estimate  reliability  and  maintainability  using 
characteristics  of  the  design.  The  orientation  to  Ada  is  due  to  its  prevalence 
in  mission-critical  systems  under  development  and  its  ability  to  serve  as  a 
notation  for  software  design.  This  role  for  Ada  as  a design  language  has  been 
recognized  as  American  National  Standards  Institute  (ANSI) /Institute  of 
Electrical  and  Electronics  Engineers  (IEEE)  Standard  990-1987. 

Previous  research  has  established  relationships  between  design  or  code 
characteristics  and  quality  factors  [1,  2].  A recent  system-level  design 
measure,  incorporating  both  control  flow  and  data  flow  in  FORTRAN 
systems,  shows  a strong  correlation  with  reliability  [31.  The  Constructive 
QUAlity  MOdel  (COQUAMO)  is  being  developed  to  estimate  software  quality, 
basing  its  estimate  on  the  observed  quality  of  previous  projects  [4]. 

Quality  Estimation  Models 

Building  the  estimation  models  depends  on  having  access  to  three 
classes  of  project  data: 

- Design,  expressed  in  Ada,  from  which  design  characteristics  can  be 
extracted 

- Environmental  factors  that  influence  the  quality  of  the  software  but 
cannot  be  deduced  from  the  design  artifact  itself  - for  example,  level  of  reuse 
or  volatility  of  changes  to  the  software 

- Data  characterizing  what  resulted  when  the  design  was  implemented, 
tested,  and  fielded  — for  example,  reported  errors  and  effort  to  maintain  the 
software 

The  basic  form  of  the  estimation  models  is  shown  in  Figure  3. 
Independent,  explanatory  variables  in  the  models  represent  architectural 
design  characteristics.  Additional  explanatory  variables  account  for  the  effects 
of  the  organization  and  its  development  process.  By  the  error  term  in  the 
model,  we  will  learn  if  we  have  been  successful  in  explaining  the  variation  in 
quality  factors  by  using  design  characteristics  and  environmental  factors. 

Ada  Design  Representation 

One  of  the  first  issues  we  faced  was  developing  an  effective 
representation  from  which  we  could  extract  design  characteristics.  Our 
interest  was  in  the  static  architecture  of  units  in  subsystems,  not  in  the 
arrangement  of  statements  within  a unit.  We  viewed  the  subsystem  as  being 
composed  of  design  units  and  relations  as  illustrated  in  Figure  4.  Our  analysis 
of  Ada  identified  several  candidates  to  serve  as  design  units  in  our  structure: 
program  units,  compilation  units,  and  library  units.  All  three  units  have 
participated  in  our  model  building,  but  compilation  units  have  been 
particularly  useful  as  a structural  unit  because  they  also  serve  as  the  unit  of 
observation  for  reporting  errors  and  changes. 


Our  Ada  analysis  identified  fifteen  kinds  of  Ada  compilation  units: 
generic  package  specification,  generic  package  body,  and  so  on  as  shown  in 
Figure  5.  The  compilation  units  are  further  divided  into  library  units  and 
secondary  units  (see  Figure  5)  and  serve  as  the  design  unit  nodes  in  the 
graphical  Ada  design  structures  in  Figure  4.  The  nodes  are  related  to  one 
another  by  the  design  relations  of  context  coupling,  spedfication/body, 
parent/subunit,  and  generic  template/instantiation.  These  design  units  and 
design  relations  comprise  our  representation  of  static  Ada  architecture.  This 
Ada  design  representation  is  discussed  further  in  [5]. 

Software  Project  Data 

Project  data  used  in  the  analysis  is  summarized  in  Figure  6.  The 
twenty-one  subsystems  included  2,143  compilation  units.  Declarations  are 
listed  in  Figure  6 because  they  play  a key  role  in  the  hypotheses  we  are 
examining.  One  of  our  underlying  themes  is  that  a developer  does  not 
declare  objects,  types,  subprograms,  etc  unless  they  are  needed.  Thus,  the 
number  and  distribution  of  these  declarations  is  of  interest  to  us  in 
characterizing  designs. 

Our  models  attempt  to  explain  variation  in  quality,  and  Figure  6 shows 
our  project  data  exhibits  significant  variation.  The  data  was  obtained  from 
the  National  Aeronautics  and  Space  Administration  (NASA) /Goddard  Space 
Flight  Center  (GSFC)  Software  Engineering  Laboratory  (SEL).  Reliability  is 
measured  as  error  density  and  varies  in  the  range  1.4  - 17.0  errors  per 
thousand  source  lines  of  code  for  the  twenty-one  subsystems.  Maintainability 
varied  across  the  subsystems  as  26  - 89%  error  corrections  requiring  less  than 
or  equal  to  one  hour  to  complete. 

Hypotheses  About  Design  Structure 

We  are  pursuing  simple  hypotheses  about  design  decision  making,  die 
resulting  design  artifact,  and  the  influences  of  design  on  reliability  and 
maintainability  . Figure  7 outlines  an  example  of  a general  hypothesis  that 
excessive  context  coupling  contributes  to  errors.  The  rationale  is  that  greater 
arc  density  in  the  directed  graph  in  Figure  7 increases  the  likelihood  of 
introducing  an  error,  because  a greater  number  of  relationships  must  be 
understood. 

Figure  8 expands  on  library  unit  B of  Figure  7.  We  have  found  that  a 
library  unit  aggregation  — a library  unit  and  its  declarative  scope  — to  be  a very 
effective  unit  of  granularity  for  our  analysis  of  Ada  designs.  Figure  8 shows  a 
second  level  of  design  derision  making  that  occurs  inside  a library  unit 
aggregation.  We  are  interested  in  whether  the  designer  has  made  any  effect 
to  manage  the  visibility  of  the  103  declarations  that  have  been  imported  into 


unit  B.  By  having  100  of  the  declarations  brought  in  (via  a "with"  clause)  to 
the  specification,  they  are  visible  throughout  the  other  units  in  the  library 
unit  aggregation,  cascading  through  the  structure.  We  don't  know  which  of 
these  declarations  are  used  by  each  unit,  but  we  want  to  record  their  visibility 
to  the  other  units  in  the  library  unit  aggregation.  The  measure  of  cascaded 
imports  in  Figure  8 takes  visibility  into  account:  100  of  the  imports  are  visible 
to  five  units  ( =>  500  cascaded  imports)  and  three  of  the  imports  are  visible  to 
two  units  ( =>  6 cascaded  imports),  for  a total  of  506  cascaded  imports. 

Preliminary  Results  of  Statistical  Analyses 

Figure  9 summarizes  the  variables  that  have  been  introduced  into  our 
models  thus  far.  Context  coupling  and  visibility  follow  the  example  in  Figure 
8,  while  import  origin  records  the  fraction  of  declarations  imported  from 
within  the  subsystem.  Two  environmental  factors  have  been  analyzed  to 
date:  volatility  captures  the  relative  number  of  changes  that  have  been  made 
in  the  subsystem,  and  custom  code  is  the  percentage  of  new  and  extensively 
modified  code  used  in  the  subsystem.  Custom  code  is  essentially  the  opposite 
of  reuse. 

The  preliminary  model  explaining  variation  in  total  error  density 
(Figure  10)  includes  the  explanatory  variables  of  context  coupling,  visibility, 
and  volatility.  In  this  model  and  other  similar  regression  analyses  we  have 
conducted,  the  coefficients  have  the  expected  signs:  error  densities  increase  as 
context  coupling,  visibility,  and  volatility  increase. 

Because  of  our  interest  in  architectural  design  decisions,  we  conducted 
additional  regression  analyses  which  concentrated  on  errors  occurring  during 
system  and  acceptance  testing.  Our  rationale  was  that,  by  eliminating  errors 
reported  during  unit  testing  (and,  therefore,  more  likely  to  be  errors  in 
implementing  a single  unit),  we  were  reflecting  more  strongly  the 
architectural  (inter-unit)  relationships.  Figure  11  summarizes  a model  to 
estimate  errors  reported  during  system  and  acceptance  testing.  Again,  context 
coupling  and  visibility  are  included  as  explanatory  variables.  Now,  however, 
custom  code  is  a significant  factor  in  explaining  the  variation  in  error  density. 
The  explanatory  power  (as  indicated  by  the  coefficient  of  determination)  is 
stronger  for  the  model  in  Figure  11. 

Summary 

Early  results  in  building  estimation  models  for  reliability  and 
maintainability  are  encouraging.  We  have  developed  representations  for  the 
static  structure  of  Ada  systems  using  compilation  units  and  library  unit 
aggregations,  allowing  us  to  test  hypotheses  about  the  effects  of  different 
structures  on  reliability  and  maintainability  . Context  coupling  measures 
consistently  figure  strongly  in  the  multivariate  regression  analyses  we  have 
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conducted.  Visibility  and  import  origin  measures  provide  further 
refinement  The  models  show  strong  effects  of  volatility  and  custom  code  on 
reliability . 

We  stress  the  preliminary  nature  of  the  quantitative  results,  based  as 
they  are  on  twenty-one  Ada  subsystems.  We  look  forward  to  continuing  to 
explore  hypotheses  with  additional  data,  leading  to  the  development  of  more 
robust  models  that  can  be  subjected  to  validation. 
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Research  Project  Overview 


• Objective: 

- Test  the  hypothesis  that  Ada  software  quality  factors  can  be 
predicted  during  design 

• Technical  Approach: 

- Build  multivariate  models  to  estimate  reliability  and 
maintainability 

- Use  characteristics  of  the  software  design  captured  In  Ada  de*gn 
language 
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Basic  Form  of  the  Estimation  Models 
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Maintainability  a / (PC1§  DCa, -.EF^EF^-I  b^  bjt ._ ) + • 
wftara* 

DC  : dasign  ctiaractarfstle  vartaNt 

EF/  : anvtronmantal  factor  vartaMa 

a/ , b : modal  paramatara 

e : error  tann  (unaxplainad  variation) 


MITRE 


|Flgura3  1 


Representing  Ada  Design  Structure 
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"Parts"  in  Ada  Static  Structure 


15  Compilation  Units  in  Ada 
as  Library  Units  (L)  or  Secondary  Units  (S) 
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Profile  of  Current  Project  Data 


• Twenty-one  subsystems  from  NASA/GSFC  SEL: 

- interactive,  ground  support  software  for  flight  dynamics  and 
telemetry  processing  applications 

- 183  K non-comment,  non-blank  source  lines  of  Ada  (KSLOC) 

- 601  Lforary  Units 

- 2,143  Compilation  Units 

- 29,849  Declarations 

• Variation  In  dependent  variables: 

- Reliability  range:  14  - 17.0  errors/KSLOC 

- Maintainability  range:  26 -89%  "easy"  fixes  (requiring  si  hour) 
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Exploring  Simple  Hypotheses 
About  Design  Structure 


• Example  of  a general  hypothesis:  Excee alee  context  coupling 
contrtxAes  to  complexity  which,  In  turn,  crxttributee  to  amors 

• Example  of  context  coupling  to  access  the  resources  of  Ubrsry  units: 
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Inside  a Library  Unit  Aggregation  to  Show 
Imported  and  Exported  Declarations 


A Library  Unit  Aggregation 
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* Imports  3 103 

* exports  s 20 

* cascaded  Imports  s 506 
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Mode!  Variables 


•r 


• Design  Characteristics: 

- Coirtoxt  Coupling:  # Imports  / 8 exports 

- visibility:  $ cascaded  Imports/#  imports 

- import  Origin:  * Internal  Imports  / 1 Imports 

• Environmental  Factors: 

- Volatility:  # changes  / * library  units 

- Custom  Code:  % new  and  extensively  modified  code 
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Preliminary  Model  fot  Reliability: 
Total  Errors  (errtot) 


• Dependent  variable:  TOTERRSL  = errtot  / KSLOC 

In  (TOTERRSL)  = .65  + .27ln(Xi)  + .05  In  Qy  ♦ .27  In  (X^) 
(•36)*  (-11)  (-18)  (.08) 

X^  s context  coupling 

X2  = visibility  wflusted  r3s.72 

X_  = volatility 

* Standard  deviation  of  the  parameter  estimate 

Mms  [Figure  10< 
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Preliminary  Model  for  Reliability: 
System  and  Acceptance  Testing  Errors  (errsa) 


• Dependent  variable:  SYACERRSL  =>  errsa  / KSLOC 


In  (SYACERRSL)  = .77  + .18ln(Xi)  + .07ln(X2)  + 57ln(X3) 
(.65)*  (.18)  (51)  (54) 


X1  a context  coupling 

X2  a visibility  adjusted  RI * 3a  .78 

X,  a custom  code 
3 


* Standard  deviation  of  the  parameter  estimate 
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Current  Research  Activity 


• Continue  to  develop  process  models  and  hypotheses  about  design 
decision-making  and  design  structures  - and  their  relationships  to 
reliability  and  maintainability 

• Explore  classification  trees  and  other  alternative  analytical  methods 

• "Cal  for  Ada  Project  Data”  - to  test  hypotheses  and  calibrate 
multivariate  models 
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Research  Project  Overview 


• Objective: 

- Test  the  hypothesis  that  Ada  software  quality  factors  can  be 
predicted  during  design 

• Technical  Approach: 

- Build  multivariate  models  to  estimate  reliability  and 
maintainability 

- Use  characteristics  of  the  software  design  captured  in  Ada  design 
language 
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Basic  Form  of  the  Estimation  Models 
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Representing  Ada  Design  Structure 
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"Parts"  in  Ada  Static  Structure 


15  Compilation  Units  in  Ada 
as  Library  Units  (L)  or  Secondary  Units  (S) 
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Profile  of  Current  Project  Data 


• Twenty-one  subsystems  from  NASA/GSFC  SEL: 

- Interactive,  ground  support  software  for  flight  dynamics  and 
telemetry  processing  applications 

- 183  K non-comment,  non-blank  source  lines  of  Ada  (KSLOC) 

- 601  Library  Units 

- 2,143  Compilation  Units 

- 29,849  Declarations 

• Variation  in  dependent  variables: 

| j - Reliability  range:  1.4- 17.0  errors/KSLOC 
^ - Maintainability  range:  26  - 89  % "easy"  fixes  (requiring  < 1 hour) 


Figure  6 


CZi»tl  Ay 


Exploring  Simple  Hypotheses 
About  Design  Structure 


Example  of  a general  hypothesis:  Excessive  context  coupling 
contributes  to  complexity  which,  in  turn,  contributes  to  errors 

Example  of  context  coupling  to  access  the  resources  of  library  units: 


7 


1 

■ 

> 

B 

20 

Notation: 

□ 


Library  unit 


A imports 
20  declarations 
from  B 


If 

fjf  \ 

t 

D 

E 

B 

3 

100 

20 

B exports 
20  declarations 
to  A 


Example: 

with  B; 
package  A is 


end  A; 

Figure  7 
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Inside  a Library  Unit  Aggregation  to  Show 
Imported  and  Exported  Declarations 


A Library  Unit  Aggregation 


Static  Measures: 

# imports  = 103 

# exports  = 20 

# cascaded  imports  = 506 


Figure  8 


zzr>K**±i 


\ 


Model  Variables 


• Design  Characteristics: 

- Context  Coupling:  # imports /#  exports 

- Visibility:  # cascaded  imports  / # imports 

- Import  Origin:  # internal  imports  / # imports 

• Environmental  Factors: 

- Volatility:  # changes  / # library  units 

- Custom  Code:  % new  and  extensively  modified  code 
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Preliminary  Model  for  Reliability: 
Total  Errors  (errtot) 


• Dependent  variable:  TOTERRSL  = errtot  / KSLOC 

In  (TOTERRSL)  = .65  + .27ln(Xi>  + .05ln(X2)  + .27ln(X  ) 

(.36)*  (.11)  (.16)  (.08) 

X - context  coupling 

X2  = visibility  adjusted  RZ=  .72 

X = volatility 

“1 

* Standard  deviation  of  the  parameter  estimate 
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Preliminary  Model  for  Reliability: 
System  and  Acceptance  Testing  Errors  (errsa) 


• Dependent  variable:  SYACERRSL  = errsa  / KSLOC 


In  (SYACERRSL)  = .77  + .19  In  (X^  + .07  In  (X2)  + .97  In  (X3) 

(.65)*  (.18)  (.21)  (.24) 


X ^ = context  coupling 
X2  = visibility 
X = custom  code 


* Standard  deviation  of  the  parameter  estimate 
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Current  Research  Activity 


• Continue  to  develop  process  models  and  hypotheses  about  design 
decision-making  and  design  structures  - and  their  relationships  to 
reliability  and  maintainability 

• Explore  classification  trees  and  other  alternative  analytical  methods 

• "Call  for  Ada  Project  Data"  - to  test  hypotheses  and  calibrate 
multivariate  models 
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Abstract 

Implementation  bias  in  a specification  is  an  arbitrary  constraint  in  the  so- 
lution space.  While  bias  is  a recognized  problem,  it  has  not  been  studied  in  its 
own  right.  This  has  resulted  in  two  effects:  Either  (1)  specifications  are  biased, 
or  (2)  they  are  incomplete,  for  fear  of  bias.  In  fact,  what  has  been  called  '‘bias71 
in  the  literature  is  sometimes  the  desirable  record  of  design  constraints  and 
design  decisions. 

This  paper  presents  a model  of  bias  in  software  specifications.  Bias  is  defined 
in  terms  of  the  specification  process  and  a classification  of  the  attributes  of  the 
software  product.  Our  definition  of  bias  provides  insight  into  both  the  origin 
and  the  consequences  of  bias:  it  alsoshows  that  bias  is  relative  and  essentially 
unavoidable.  Finally  we  describe  current  work  on  defining  a measure  of  bias, 
formalizing  our  model,  and  relating  bias  to  software  defect. 

Keywords.  Implementation  bias,  software  design,  softT^re  defects,  require- 
ments, formal  specifications. 


1 Introduction 

Most  informal  software  specifications  are  ambiguous,  imprecise,  unclear,  incomplete  n 
etc.  Moreover,  this  is  usually  not  evident  by  looking  at  a particular  specification. 

‘This  research  is  supported  in  part  by  NASA  Goddard  Space  Flight  Center,  grant  NSG-5I23. 
t Additional  support  from  ODEPLAN  Chile  and  the  Catholic  University  of  Chile. 
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The  need  to  produce  better  specifications  has  prompted  research  in  several  fea- 
tures of  specifications.  Guttag  and  Horning  [2]  define  sufficient- completeness  and 
consistency  of  an  algebraic  specification  in  terms  of  existence  and  uniqueness  of  ax- 
ioms in  the  specification-  Jones  [3j  defines  bias  for  model-based  specifications  as 
the  property  of  nonuniqceness  of  representation  within  the  model.  Yue  [8]  gives  a 
definition  for  completeness  of  a specification,  in  terms  of  the  satisfaction  of  a set  of 
explicitly  stated  goals.  He  adso  defines  pertinence,  a property  related  to  bias.  Nicholl 
[5]  defines  the  concept  of  reachability  for  model-based  specifications  as  the  ability  to 
reach  every  consistent  state  by  some  sequence  of  operations,  and  plans  to  study  other 
features  of  specifications,  including  bias. 

This  current  research  work  grew  out  of  studies  within  the  Software  Engineering 
Laboratory  (SEL)  of  NASA  Goddard  Space  Flight  Center,  which  has  been  monitoring 
the  development  of  ground  support  software  for  unmanned  spacecraft  since  1976. 
Our  goal  is  to  improve  the  quality  of  software  specifications  within  the  SEL.  On 
the  realization  that  existing  specification  languages  were  inappropriate  for  use  by 
programmers  at  NASA,  we  developed  the  executable  specification  language  PUC 
(pronounced  POOK),  designed  to  be  used  with  Ada  in  this  environment  [7]. 

The  executability  of  specification  languages  like  PUC  had  the  disadvantage  that 
much  detail  had  to  be  included  in  specifications,  limiting  the  creativity  of  the  imple- 
mentor and  ruling  out  some  possibly  good  designs.  Hence,  instead  of  looking  at  the 
language  problem,  now  we  are  looking  at  this  problem  itself,  the  so-called  ‘implemen- 
tation bias1  in  specifications.  The  area  of  bias  in  specifications  is  largely  unexplored 
but  is  important.  In  fact,  the  problem  of  bias  is  mentioned  in  several  works,  including 
both  description  and  critiques  of  specification  methods. 

1.1  Definitions 

Some  related  concepts  are  defined. 

Attribute  An  attribute  of  a product  is  a required  or  desired  feature  of  the  product, 
its  environment,  or  its  development  process. 

We  use  ‘attribute1  instead  of  the  more  customary  ‘requirement1,  because  the 
latter  is  associated  with  mandatory  features  that  are  described  in  the  initial 
phases  of  development. 

Specification  A specification  of  a product  is  a description  of  a set  of  its  attributes. 

Under  this  definition,  both  the  requirements  document  and  the  preliminary  de- 
sign document  of  the  waterfall  model  are  specifications. 

Solution  set  The  solution  set  of  a problem  is  the  set  of  all  products  that  solve  the 
problem,  regardless  of  the  specification. 
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1.2  The  Problem  of  Bias 


An  ideal  specification  is  general  and  precise  enough  so  that  a software  system  satisfies 
the  specification  if  and  only  if  it  solves  the  problem  at  hand.  This  view  is  too  opti- 
mistic, because  there  can  be  many  solutions  to  the  real  world  problem  that  do  not 
even  involve  software.  In  practice,  we  only  need  that  software  systems  satisfying  the 
specification  be  solutions,  and  that  no  substantial  class  of  solutions  does  not  satisfy 
the  specification. 

A specification  is  biased  if  it  arbitrarily,  favors  some  implementations  over  others. 
Biased  specifications  can  overly  constrain  the  solution  set,  precluding  some  valid 
implementations  as  solutions  to  the  problem  at  hand.  Hence,  the  amount  of  bias  is  a 
common  yardstick  to  judge  software  specification  methods:  those  that  are  considered 
biased  are  usually  rejected. 

One  of  the  main  problems  of  not  having  a good  definition  of  bias  is  that  it  is 
sometimes  confused  with  intended  constraints  in  the  solution  set.  For  example,  a 
designer  may  want  to  favor  some  realizations  over  others  for  compliance  with  some 
programming  techniques  that  are  customary  at  that  site.  In  fact,  we  argue  that 
much  of  what  has  been  called  bias  is  simply  a manifestation  of  design  decisions,  that 
purposely  constrain  the  solution  set.  Of  course,  we  also  have  many  specifications  that 
are  indeed  biased. 

2 A Model  of  Bias 

We  present  a framework  to  discuss  bias,  based  on  a classification  of  the  attributes  of 
the  product  being  specified  aind  the  process  of  creation  of  attributes. 

2-1  Classes  of  Attributes 

We  classify  the  attributes  of  a product  with  respect  to  their  inclusion  in  the  specifi- 
cation. The  maun  criteria  we  consider  are  explicitness  and  origin. 

2.1.1  Explicitness 

An  attribute  is  explicit  if  it  is  present  in  the  specification;  otherwise,  it  is  nonexplicit. 
Nonexplicit  attributes  are  further  classified  in  four  classes. 

Implicit  attributes  are  those  that  are  understood  to  be  part  of  every  product  in  the 
application  domain,  auid  so  they  are  unstated. 

Implied  attributes  are  logicad  consequences  of  other  attributes. 

Absent  attributes  are  requirements  unintentionally  omitted  in  the  specification. 
These  are  not  part  of  every  product  in  the  application  domain. 
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Fictitious  attributes  [4]  are  not  attributes  at  all,  but  assumptions  made  by  the  * ** 

reader  of  the  specification:  the  reader  believes  that  they  are  either  implicit, 
implied  or  absent  attributes. 

2.1.2  Origin 

An  explicit  attribute  is  new  with  respect  to  a certain  specification  stage  if  it  is  first  v 

made  explicit  at  that  stage;  otherwise,  the  attribute  is  inherited  from  previous  stages. 

In  an  ideal  setting  all  attributes  new  in  a specification  stage  are  the  consequence 
of  design  decisions  taken  at  that  stage.  However,  nonexplicit  attributes  :r  the  previ- 
ous specification  usually  induce  the  specifier  of  the  current  stage  to  introduce  extra 
attributes.  Besides,  some  attributes  may  be  imposed  by  the  limitations  of  the  specifi- 
cation method  and  language  used.  This  motivates  the  following  classification  of  new 
attributes  with  respect  to  their  origin. 

Designed  attributes  are  the  consequence  of  design  decisions  taken  at  the  current 
specification  stage.  They  are  purposely  set  to  guide  the  implementation  process 
and  constrain  the  solution  set. 

Explicatory  attributes  are  created  by  making  explicit  attributes  that  are  implicit 
in,  implied  by,  or  absent  from  previous  stages. 

Imposed  attributes  are  those  imposed  by  the  limitations  of  the  specification  method 
and  language  used. 

For  example,  a method  may  accept  only  “complete”  specifications  (as  defined 
by  the  language),  which  leads  to  introduce  attributes  to  satisfy  the  rules  of  the 
language. 

Extraneous  attributes  are  created  by  making  explicit  fictitious  attributes. 

For  example,  a fictitious  attribute  seen  by  the  designer  in  a requirements  doc- 
ument may  introduce  explicit  constraints  in  the  design  document. 

2.2  The  Nature  of  Bias 

The  process  of  refining  successive  specifications  makes  explicit  attributes  that  were 
previously  implicit,  implied,  or  absent.  This  process  also  makes  explicit  design  de- 
cisions taken  at  the  current  stage.  Unfortunately,  it  also  makes  explicit  fictitious 
attributes  (i.e.,  creates  extraneous  attributes1)  and  creates  imposed  attributes  (Fig- 
ure 1).  This  leads  to  the  definition  of  bias  in  terms  of  the  origin  of  the  attributes 
described  in  a specification. 

1 Extraneous  attributes  lead  to  errors  and  constraints  in  the  solution  se*;  here  we  axe  studying 
only  the  constraints. 
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Figure  1:  Classification  of  attributes.  Fictitious  attributes  are  shown  with  segmented 
line,  because  they  are  not  real  attributes  but  misconceptions.  Dotted  lines  show  the  origin 
of  new  explicatory  and  extraneous  attributes. 

Definition.  A specification  containing  extraneous  or  imposed  attributes  is  biased. 

This  definition  provides  insight  into  the  problem  of  bias,  including  both  its  origins 
and  consequences.  The-origin  of  bias  is  either  wrongful  interpretation  of  nonexplicit 
attributes  or  the  limitations  imposed  by  the  specification  method.  The  consequences 
are  that  the  set  of  possible  solutions  can  be  overly  constrained  or  that  the  solution 
adopted  can  be  suboptimal.  That  is,  a biased  specification  will  lead  the  design  towards 
particular  implementations  that  are  not  necessarily  the  best  possible. 

Bias  content  in  a specification  cannot  be  measured  directly,  because  bias  is  defined 
in  terms  of  the  origin  of  attributes  which  is  usually  uncertain.  Furthermore,  bias  is 
relative  to  the  application  domain  and  the  software  engineering  environment,  because 
the  domain  and  environment  define  what  is  implicit. 

The  relative  nature  of  bias  is  an  essential  characteristic.  It  stems  from  the  existence 
of  nonexplicit  attributes  and  the  inherent  uncertainty  with  respect  to  those  attributes. 
As  long  as  there  are  nonexplicit  attributes,  there  will  be  doubt  about  these  attributes 
and  hence  possibility  of  bias.  Furthermore,  making  explicit  all  implicit  attributes  of 
a certain  domain  and  environment  still  leaves  two  sources  of  bias:  restrictions  on  the 
method  and  languages,  and  absent  attributes. 
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2.3  Example 

Assume  an  environment  in  which  all  programs  are  written  in  a particular  program- 
ming language.  In  this  environment  the  presence  of  idioms  of  this  language  in  a 
specification  is  not  necessarily  bias,  unless  another  implementation  language  is  intro- 
duced to  the  environment. 

This  is  what  happened  at  the  SEL  where  software  specifications  for  satellite  dy- 
namic simulators  were  •‘heavily  biased  toward  FORTRAN.  In  fact  the  high  level 
design  for  the  simulators  is  actually  in  the  specifications  document"  [1].  This  was  not 
a problem — on  the  contrary,  it  facilitated  both  development  and  reuse  of  specification 
and  code — until  the  first  development  in  Ada:  the  specifications  had  to  be  rewritten 
first. 

Given  our  definition  of  bias  these  FORTRAN-oriented  specifications  were  not  nec- 
essarily biased;  they  contained  many  designed  attributes.  Before  Ada  was  introduced, 
the  use  of  FORTRAN  was  implicit.  After  that,  the  language  used  had  to  be  decided: 
assuming  a FORTRAN  implementation  was  a fictitious  attribute. 

3 Current  Research 

We  are  improving  the  model  presented  in  this  paper  in  several  aspects. 

Formalization  One  weakness  of  our  model  as  presented  here  is  that  we  do  not 
formalize  the  concept  of  ‘attribute’.  Moreover,  we  define  ‘specification’  as  a set  of 
attributes,  disregarding  dependencies  among  attributes.  At  least  two  kinds  of  de- 
pendencies are  relevant:  attributes  defined  in  terms  of  other  attributes,  and  origin 
relationships  among  attributes. 

To  address  this  problem,  we  have  develped  a formalism  to  write  specifications  that 
is  flexible  and  extensible  enough  to  include  information  about  the  specification  itself 
(e.g.,  origin  information).  Within  the  system,  called  Extensible  Description  Formal- 
ism (EDF),  attributes  axe  defined  as  mappings  from  objects  to  values;  objects  are 
represented  by  extensible  polymorphic  records  whose  fields  are  the  attribute  names. 
EDF  can  represent  both  functional  dependencies  of  attributes  and  also  attributes 
defined  as  aggregations  of  several  attributes.  Origin  information  is  stored  by  repre- 
senting all  attribute  values  as  objects  that  have  an  origin  attribute  and  a content 
attribute. 

We  developed  a prototype  of  EDF  and  used  it  in  the  context  of  classification  of 
reusable  software  components.  We  axe  currently  developing  a complete  version  based 
on  a formal  specification  of  the  language  [6]. 

Measuring  Bias  In  this  work  we  have  not  provided  a characterization  of  biased 
specifications.  Because  of  the  relative  nature  of  bias  we  cannot  develop  a precise 
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metric  of  bias,  but  we  can  define  approximate  metrics,  based  on  origin  information 
explicitly  recorded  in  a specification. 

An  important  feature  of  EDF  is  that  it  is  possible  to  compare  two  specifications 
defining  some  distance  from  one  specification  to  another.  There  is  a predefined  com- 
parator function  to  estimate  the  adaptation  effort  in  the  context  of  reuse  of  software 
components,  and  it  is  possible  to  define  other  comparator  functions. 

We  can  measure  bias  comparing  the  distance  between  two  succesive  specification 
stages.  If  we  use  the  predefined  comparator  function  we  get  a gross  upper  bound  on 
bias  (as  if  all  attributes  new  to  the  second  stage  were  bias).  On  the  other  hand,  by 
defining  a comparator  function  that  uses  origin  information,  we  expect  to  provide  a 
reasonable  estimate  of  bias  introduced  in  the  second  stage. 

Bias  Propagation  Our  model  does  not  explain  how  bias  propagates,  because  we 
have  defined  bias  in  terms  of  new  attributes.  Strictly  speaking,  within  our  model  no 
inherited  attribute  is  bias.  Since  we  want  to  measure  bias  content  in  a specification, 
we  have  to  consider  those  attributes  whose  origin  include  extraneous  or  imposed 
attributes.  For  example,  if  a design  decision  is  taken  consistently  with  some  inherited 
attribute  that  was  extraneous  when  created,  then  this  decision  has  some  form  of  bias 
too. 


Bias  and  Software  Defects  Our  model  describes  the  origin  for  software  attributes, 
and  defines  bias  as  the  existence  of  some  attributes  with  ‘illegitimate’  origin.  The 
reader  can  realize  that  these  illegitimate  origins  are  also  the  cause  of  software  defects. 

Software  defects  are  classified  in  three  groups:  errors  are  conceptual  misunder- 
standings, favlis  are  concrete  (explicit)  manifestations  of  errors  in  documents,  and 
failures  are  manifestations  of  faults  during  execution. 

There  is  an  intimate  relashionship  between  errors  and  fictitious  attributes,  and 
between  software  faults  and  bias.  In  a sense,  bias  is  like  a very  minor  fault  that 
instead  of  leading  to  failures,  leads  to  inefficiencies.  The  consequence  of  this  is  that 
methods  to  avoid  bias  (e.g.,  making  explicit  implicit  requirements)  will  also  avoid 
software  defects. 


4 Conclusion 

Even  though  bias  is  widely  recognized  as  an  undesirable  property  of  specifications,  it 
has  not  been  adequately  studied.  This  has  caused  confusion  with  the  related  concepts 
of  design  constraint  and  design  decision,  so  that  the  presence  of  designed  attributes 
in  specifications  has  been  considered  undesirable.  This  is  in  contrast  with  the  use 
of  specifications  in  other  engineering  disciplines,  where  a specification  may  include 
many  designed  attributes  (e.g.,  materials,  manufacturing  methods). 
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In  this  paper  we  presented  a model  to  describe  the  nature  of  bias  and  distinguish  *' 

bias  from  designed  attributes  and  other  attributes  in  a specification.  This  model  is 
based  on  a classification  of  all  the  attributes  described  in  a specification  and  also 
those  that  are  not  described  (i.e.,  nonexplicit);  it  explains  the  nature  of  bias,  but 
since  it  uses  nonexplicit  attributes  it  does  not  lead  to  any  definite  method  to  detect  * 

bias.  However,  the  model  does  explain  both  the  relative  and  unavoidable  nature  of 
bias.  Moreover,  because  the  model  explains  the  origin  of  bias,  it  provides  insight  into 
bias  avoidance. 

Our  goal  is  to  improve  the  quality  of  the  specifications  by  removing  bias  and 
including  all  relevant  implementation-oriented  information.  To  achieve  this  goal  we 
need  to  tell  bias  from  designed  attributes.  This  requires  information  on  the  origin  of 
the  attributes,  which  is  usually  unknown.  Hence,  we  have  developed  a formalism  in 
which  origin  information  can  be  recorded,  as  a ^-neralization  of  the  common  prac- 
tice of  tracing  design  documents  and  actual  code  back  to  the  original  statement  of 
requirements. 
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Introduction 

Importance  of  specifications 


Life-cycle  models  consist  of  refinement  of 
succesive  specification  stages. 

Specification:  description  of  a set  of 
requirements. 

‘Requirement’  is  used  in  all  stages,  not  just 
the  first. 

Staged  specifications  imply 

• errors  in  previous  stages  are  costly 

• product  quality  depends  on  specification 

We  need  high  quality  specifications. 
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Introduction 

Solutions  vs.  specified  products 


U = product  universe 
A = acceptable  products  (solutions) 

S = specified  products 
S — A = specified  unacceptable  products 
A — S = solutions  not  specified 

Ideally:  S = A 

Needed:  S — A = 0 
Desired:  A - S is  small 
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' The  what  and  how  dilemma 

( 

i «■ 

i I 

l 

i Typical  rule  to  avoid  overspecification 

\ 

i r 

! Specify  what  the  system  should  do, 

I f not  how  to  do  it. 

i ' 

: i 

■ But  what's  and  how’s  depend  on  viewpoint, 

j What:  something  already  fixed 

I j-  How:  an  option 

i How’s  become  what’s. 


Confusion  creates  underspecification. 
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Classification  of  requirements 
Explicitness 


Requirements  of  a product  are  classified  as 


• Explicit:  written  in  the  specification 

• Inherited:  comes  from  previous  stages 

• New:  created  at  this  stage 

• Nonexplicit:  not  written 

• Fictitious:  not  a requirement,  but  a 
misconception 

• Implicit:  belongs  to  all  products 

• Implied:  consequence  of  other 
requirements 

• Absent:  unintentionally  omitted 
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Classification  of  requirements 
i Explicitness 


Classification  of  requirements 
Origin  of  new  requirements 


New  requirements  are  classified  as 

• Designed:  restriction  on  purpose 

• Imposed:  restriction  of  method  or 
language 

• Extraneous:  makes  explicit  a fictitious 
requirement 

• Explicatory:  makes  explicit  a-nonexplicit 
requirement 
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Classification  of  requirements 
Creation  of  new  requirements 
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Bias 
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The  nature  of  bias 
Definition 


Definition 


A specification  containing  extraneous 
or  imposed  requirements  is  biased . 


Origin  of  bias 

• wrongful  interpretation  of  nonexplicit 
requirements 

• limitation  imposed  by  method  or  language 

Consequences  of  bias 

• solutions  not  specified 

• adoption  of  a nonoptimal  solution 
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i ne  nature  ot  bias 
Essential  limitations 
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Bias  is  not  an  absolute  property. 
Bias  depends  on  origin. 


Bias  depends  on  application  domain  and 
environment,  because  of  different  implicit 
requirements. 

Bias  cannot  be  completely  eliminated. 


The  nature  of  bias 
Example  (at  NASA/GSFC) 


Before  introduction  of  Ada,  FORTRAN  was 
implicit. 

Specifications  had  many  FORTRAN-orientea 
requirements. 

During  first  Ada  project,  the  specifications 
had  to  be  rewritten. 

After  introducing  Ada,  assuming  FORTRAN 
was  a fictitious  requirement. 
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Conclusions 

^ Other  Considerations 


• Formal  definition  of  ‘requirement’. 

• Method  to  find  bias. 

• Formalism  to  write  specifications  with 
attributes  (e.g.,  origin  of  requirements). 
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Conclusions 

Contributions 


A theory  of  bias 

• classification  of  requirements 

• origin  of  requirements 

• precise  definition  of  bias 

• bias  is  inherent  to  specifications 

The  Extensible  Description  Formalism  (EDF) 
a language  to 

• describe  requirements  and  their  attributes 

• compare  specifications 

• measure  bias 

Bias  in  relationship  with  software  defects: 

• errors  <-►  fictitious  requirements 

• faults  bias 
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Conclusions 
Next  Steps 


Try  these  ideas  measuring  bias  in  a specific 
project. 

Extend  this  theory  to  explain  creation  of 
software  defects. 
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INTRODUCTION 

The  Software  Engineering  Laboratory  (SEL)  is  an  organization  sponsored  by  the 
National  Aeronautics  and  Space  Administration/Goddard  Space  Flight  Center 
(NASA/GSFC).  It  was  created  in  1977  to  investigate  the  effectiveness  of  software  en- 
gineering technologies  applied  to  the  development  of  applications  software.  The  SEL 
has  three  primary  organizational  members:  NASA/GSFC,  Systems  Development 
Branch;  University  of  Maryland,  Computer  Science  Department;  and  Computer 
Sciences  Corporation,  Flight  Dynamics  Technology  Group. 

Applications  developed  in  the  SEL  environment  are  used  primarily  to  determine  and 
predict  the  orbit  and  orientation  of  earth  orbiting  satellites.  There  are  many  similari- 
ties among  systems  developed  for  different  satellites,  and  those  similarities  create  a 
climate  in  which  software  reuse  enhances  efficiency  and  cost  effectiveness  in  the  devel- 
opment process.  Consequently,  reuse  has  always  been  an  important  SEL  priority. 
Over  the  last  several  years,  with  the  introduction  of  Ada  and  object-oriented  design 
(OOD)  techniques,  the  SEL  has  been  able  to  achieve  a significant  increase  in  the 
amount  of  software  reuse.  Figure  1 represents  graphically  the  increase  in  software 
reuse  on  recent  Ada  projects  as  compared  with  reuse  on  FORTRAN  projects  in  a simi- 
lar time  period  (Reference  1). 

Incorporated  into  all  SEL  development  is  the  Process  Improvement  Paradigm  (Refer- 
ence 2),  which  includes  the  following  four  steps: 

1.  Understand  and  characterize  the  current  environment. 

2.  Try  a candidate  improvement. 

3.  Measure  any  change  and  provide  feedback  on  experiences. 

4.  Adopt  candidate  improvements  with  favorable  results:  reject  those  with  un- 
favorable results. 

This  Ada  reuse  study  has  as  a primary  objective  the  first  process  improvement  step. 
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Figure  1 . Reuse  of  Simulator  Components 


This  paper  describes  the  analysis  performed  and  some  preliminary  findings  of  a study 
of  software  reuse  on  Ada  projects  at  the  SEL. 

It  describes  the  representations  used  to  make  reuse  patterns  and  trends  apparent.  The 
paper  focuses  on  those  aspects  of  the  analysis  that  are  applicable  to  other  environ- 
ments and  demonstrates  graphically  some  of  the  specific  patterns  of  reuse  studied  by 
the  SEL. 

THE  STUDY 

The  study  examined  software  components  (i.e.,  source  code  files)  of  three  general 
types:  those  that  were  developed  on  an  earlier  project  within  the  SEL  environment: 
those  that  were  acquired  from  an  external  source,  such  as  a public  domain  repository 
or  commercial  vendor;  and  those  that  were  adapted  from  a similar  component  on  the 
same  project.  The  reuse  information  in  the  SEL  data  base  (Reference  3),  combined 
with  the  source  code  files  and  design  documents  for  each  executable  program,  enabled 
the  study  team  to  trace  the  evolution  of  software  components  over  a 5-year  period  that 
included  several  generations  of  similar  applications. 

As  part  of  the  SELs  standard  data  collection  process  (Reference  4),  the  projects 
involved  in  the  study  had  recorded  each  component  of  the  programs  comprising  each 
system.  Each  component  was  classified  according  to  the  percentages  of  new  and 
reused  code  it  contained. 

For  those  components  considered  reused,  the  parent  project  (or  library)  was  identified 
and  recorded. 
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Reuse  Defined  by  SEL  Study 

This  SEL  Ada  study  focuses  on  the  reuse  of  source  code  files  obtained  from  existing 
projects  or  libraries.  Although  projects  in  the  SEL  apply  other  forms  of  reuse,  such  as 
specification  or  design  products,  those  other  forms  were  not  the  subject  of  this  study. 

During  development,  project  developers  classified  (Reference  4)  the  origin  of  each 
component  according  to  the  amount  of  new  versus  reused  code  it  contained.  The  high- 
est degree  of  reuse  was  for  components  that  were  reused  verbatim  (i.e.,  unchanged). 
The  next  degree  was  for  components  that  were  slightly  modified  (25  percent  or  less  of 
their  source  code  changed).  Finally,  the  lowest  degree  was  for  components  that  were 
extensively  modified  (more  than  25  percent  changed). 

The  following  types  of  components  were  excluded  from  the  definition  of  reuse  for  this 
study. 

• All  components  developed  from  scratch,  including  any  such  component  that 
may  have  contained  fragments  of  code  from  one  or  more  source  files  or  de- 
sign concepts  borrowed  from  existing  components. 

• Common  components  developed  for  a given  pro ject,  whether  used  un- 
changed multiple  times  in  a given  executable  program  or  in  multiple  pro- 
grams within  the  project. 

Ada  Projects  Studied 

The  study  included  nine  Ada  projects.  Three  projects  are  dynamics  simulators  that 
model  the  spacecraft’s  orbit  and  attitude  to  evaluate  attitude  control  algorithms. 
Three  projects  are  telemetry  simulators  used  to  generate  test  data  for  attitude  deter- 
mination software.  One  project  is  an  embedded  orbit  determination  system.  Another 
project  studied  developed  or  collected  components  for  a reuse  library.  Finally,  one 
project  developed  a tool  for  assembling  systems  from  reusable  components.  Figure  2 
shows  a timeline  of  the  nine  Ada  projects  studied. 

Representations  of  Reuse 

To  identify  patterns  of  reuse  over  time  or  within  the  software’s  architecture,  the  SEL 
study  created  a series  of  graphical,  textual,  and  combination  graphical/textual  repre- 
sentations of  component  origination  and  reuse  information. 

Reuse  Across  Projects  Over  Time 

One  group  of  four  reports  graphically  represents  software  reuse  by  multiple  projects 
over  time:  PROJECT  REUSE  SUMMARY,  PROJECT  REUSE  NETWORK, 
REUSE  FROM  LINEAGE,  and  REUSE  BY  LINEAGE. 

The  PROJECT  REUSE  SUMMARY  report  shows,  at  the  project  level,  the  number  of 
components  reused  from  each  project  or  library.  The  report  is  constructed  as  a matrix 
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Figure  2.  Ada  Projects  in  the  Flight  Dynamics  Division 

with  projects  producing  reusable  components  along  the  horizontal  axis  and  projecs 
consuming  reusable  components  along  the  vertical  axis.  Each  cell  gives  the  number  or 
components  obtained  from  a producing  project  and  reused  by  a consuming  projee. 
Two  versions  of  this  report  could  be  produced;  a detailed  report  (not  shown),  which 
gives  subtotals  for  each  of  the  three  degrees  of  reuse  defined  for  the  study,  and  a ton! 
reuse  report  (shown  in  Table  1),  which  gives  only  the  total  for  all  three  degrees  a 
reuse. 

The  PROJECT  REUSE  NETWORK  report  (Figure  3)  illustrates  reuse  as  a directed 
graph.  The  nodes  in  the  graph  represent  each  project, .while  the  fill  pattern  indicates 
the  type  of  application.  The  thickness  of  each  arrow  indicates  the  rough  order  of  mat- 
nitude  for  the  number  of  components  reused,  and  the  direction  of  the  arrow  indicates 
the  producing  and  consuming  projects.  In  this  report,  an  increase  was  evident  in  the 
amount  of  reuse  for  successive  generations  of  both  telemetry  simulator  and  dynamics 
simulator  applications. 

A REUSE  FROM  LINEAGE  report  focuses  on  the  origin  of  each  reused  component 
For  each  instance  of  a component’s  use,  the  project  name,  the  subsystem  name,  tie 
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Table  1 . Number  of  Reused  Components  by  Project 
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component  name,  and  degree  of  change  required  are  given.  The  lineage  history  of 
each  component  is  shown  by  indentation,  with  each  level  of  indentation  indicating  a 
prior  reuse  generation.  The  following  example  shows  the  lineage  of  the  project 
EUVEDSIM,  subsystem  SHEM’s  component  EAKTH_ATMOSPHERE. 

EUVEDSIM  SHEM  EAKTH_ATMOSPHERE  Reused  (Unchanged)  from 

GO  AD  A SHEM  EAKTH_ATMOSPHERE  Reused  (Extensively  modified)  from 

GRODY  TM  ATMOSB  New 

Note  that  the  names  of  the  original  component,  ATMOSB,  and  subsystem,  TM,  of 
GRODY  were  changed  by  GOA DA 

A REUSE  BY  LINEAGE  report  was  used  to  show  the  projects)  that  have  reused  each 
component  For  each  instance  of  a component’s  use,  the  project  name,  the  subsystem 
name,  the  component  name,  and  degree  of  change  are  given.  The  family  tree  of  each 
component  is  shown  by  indentation,  with  each  level  of  indentation  indicating  a subse- 
quent generation  of  reuse.  Using  this  report  and  focusing  on  the  degree  of  change 
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Figure  3.  Project  Level  Reuse  Network 


required,  the  study  staff  noticed  two  common  patterns.  The  first  pattern,  a sample  of 
which  follows,  is  for  components  that  implement  a general  solution. 

_component_name  (NEW)  Reused  by 

(UNCHANGED) 

(UNCHANGED) 

(UNCHANGED) 

The  second  partem,  shown  in  the  following  example,  was  seen  in  components  that 
have  incompletely  or  incorrectly  implemented  a general  solution. 

—component  name  (NEW)  Reused  by 

(—MODIFIED) 

(...MODIFIED)  3 

(...MODIFIED) 


Another  component  attribute  that  can  be  determined  by  using  the  REUSE  BY 
LINEAGE  report  is  the  component’s  domain  of  reuse.  Asstiming,  as  is  the  case  in  this 
study,  that  the  researcher  knows  the  application  type  (or  domain)  of  each  project,  the 
domain  of  reuse  can  be  derived  by  examining  the  type  of  projects  reusing  a given  com- 
ponent. 

Both  the  REUSE  FROM  LINEAGE  and  the  REUSE  BY  LINEAGE  reports  can  also 
be  useful  for  configuration  management  purposes,  by  identifying  the  projects  using  a 
given  component  If  one  of  the  projects  suggests  an  enhancement  or  correction,  the 
other  reusing  projects  could  also  be  notified. 

Reuse  Within  a Project’s  Architecture 

In  addition  to  examining  reuse  over  time,  this  study  also  examined  reuse  within  each 
project’s  architecture.  To  illustrate  the  findings  on  a system-wide  scale,  component 
reuse  was  superimposed  on  graphical/textual  representations  of  each  system’s  static 
structure,  its  calling  hierarchy,  and  its  compilation  order.  Architectural  representa- 
tions were  derived  from  the  source  code  and/or  design  documents. 

Each  representation  of  a project’s  architecture  requires  between  3 and  10  pages  of 
graphical/textual  hard  copy,  which  made  it  difficult  to  observe  the  overall  reuse  pat- 
terns. To  overcome  this  difficulty,  the  degree  of  change  was  color-coded  on  all  repre- 
sentations of  reuse  within  a project’s  architecture.  These  representations  could  then 
be  posted  on  a wall  and  the  color-coded  reuse  patterns  observed  from  a distance. 

Reuse  on  a Project’s  Static  Structure 

Components  were  organized  according  to  the  Ada  package  to  which  they  belong. 
These  packages  were  further  organized  according  to  the  logical  subsystems  defined  by 
the  original  developers.  The  degree  of  component  reuse  was  then  overlaid  on  the  sub- 
system and  the  resulting  representation  analyzed  for  patterns.  The  following  is  a sam- 
ple from  the  REUSE  ON  STATIC  STRUCTURE  report  for  the  GOADA  project. 
This  sample  shows  the  degree  of  reuse  in  one  package,  EPHEM_FILE_MANAGER, 
which  is  pan  of  the  SHEM  subsystem. 


Ada  Name  _ 

Component  Tvpe 

Decree  of  Reuse 

EPKEM  FTLE  MAN  AG  ER  . 

Package  Spec 

Unchanged 

. 

Package  Body 

Unchanged 

.CLOSE 

Procedure  Body 

Unchanged 

.COORDINATE  SYSTEM  OF 

Function  Body 

New 

.DATA  INTERVAL  OF 

Function  Body 

Unchanged 

.END  TIME  OF 

Function  Body 

Unchanged 

.initialize" 

Function  Body 

Unchanged 

.READ  EPHEM  POINT 

Procedure  Body 

Unchanged 

.REAL  TO  AC  TIME 

Function  Body 

Unchanged 

-START  TIME  OF 

Function  Body 

Unchanged 

Examination  of  this  representation  confirmed  our  intuition  that  in  most  cases  the 

granularity  of  reuse  was  library  units  (such  as  packages  or  standalone  procedures  or 
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functions).  The  granularity  of  reuse  for  telemetry  simulators  changed  c'-amatically  to 
the  entire  system  architecture  in  the  EUVETELS  project,  in  which  structure  and  al- 
most all  components  from  the  UARSTELS  project  were  reused  unchanged. 

The  study  examined  the  hypothesis  that  a component’s  interface  (i.e.,  its  Ada  specifica- 
tion) is  reusable  with  fewer  changes  than  its  implementation  (i.e.,  its  body).  In  general, 
the  hypothesis  was  confirmed  by  the  preceding  representation.  The  notable  exception 
was  the  second  generation  of  dynamics  simulators  (i.e^  the  GOADA  project),  in  which 
previously  large  packages  were  divided  into  smaller  packages.  In  that  case,  new  pack- 
age specifications  were  created  for  unchanged  or  slightly  modified  package  bodies. 

Another  hypothesis  examined  was  that  for  some  groups  of  components  (i.e.,  a package 
or  a subsystem),  there  are  some  parts  that  must  be  consistently  tailored  to  each  use. 
Although  parts  of  some  packages  or  subsystems  were  modified  in  each  successive  gen- 
eration, it  was  not  possible  to  distinguish  mission-tailored  parts  from  those  modified 
for  other  reasons. 

Another  hypothesis  examined  was  that  some  parts  of  an  application’s  architecture  lend 
themselves  to  reuse  more  than  do  others.  The  study  confirmed  the  hypothesis  by  re- 
vealing that  the  highest  concentration  of  consistently  reused  components  was  among 
those  implementing  basic  data  structures  and  mathematical  functions  or  operating  on 
standardized  data  files. 

Reuse  on  Project’s  Call  Tree 

The  static  analysis  of  subprogram  calls  in  each  component,  starting  with  the  main  sub- 
program, is  used  to  create  a call  tree.  The  call  tree  is  represented  textually  as  an 
indented  outline,  in  which  each  level  of  indentation  denotes  a level  of  nested  calls.  The 
order  of  subprograms  in  the  call  tree  reflects  the  order  in  which  each  subprogram  call 
appears  in  the  text  and,  for  sequential  code,  reflects  the  execution  order  of  the  calls. 
The  call  analysis  demonstrates  whether  reuse  occurs  predominantly  at  the  branch  or 
leaf-node  level  of  the  call  tree. 

Reuse  on  Project’s  Compilation  Order 

To  study  reuse  on  compilation  order,  the  study  group  generated  a textual  representa- 
tion combined  with  color-coded  graphical  elements.  Each  project’s  library  units  were 
ordered  in  a roughly  bottom-up  fashion  according  to  Ada  WITH  dependency  (i.e..  the 
units  with  fewer  WITH  dependencies  were  listed  first).  Component  reuse  was  then 
overlaid  on  the  library  units.  Examination  of  this  representation  revealed  that  the 
ability  in  Ada  to  separate  specification  from  implementation  was  effective  in  isolating 
higher  level  reused  components  from  extensive  changes  in  lower  level  components. 
Also  evident  using  this  representation  was  the  ability  to  reuse  Ada  generics  without 
change,  even  in  cases  where  their  functionality  included  entirely  new  capabilities  im- 
plemented by  generr:  subprogram  parameters.  A sample  from  a REUSE  ON 
PROJECT’S  COMPILATION  ORDER  report  is  shown  in  Figure  4. 
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GENERICHARDWARE 

WHEEL 

TORQUER 

GYRO  ANALOG  RATES 
TONS  "DOPPLER" 

GENERIC  MODEL  GYRO  ANGLES 
GYRO.ANGLES 
SCHEDULER 
SIMULATOR  DATABASE 
INITIAL  MRAMEIERS.REFORT 
MINOR  FRAME 
TELEMETRY  RECORD 
TELEMETRY SIMULATOR 


(Spec  - OLDUC  Body 
(Spec  - OUJUC) 

(Spec  - OLDUQ 
(Spec  - OLDUQ 
(Spec  - NEW) 

(Spec  - OLDUC  Body 
(Spec  - OLDUQ 
(Spec  - OLDUC  Body 
(Spec  - SLMOD,  Body 
(Spec  - OLDUC  Body 
(Spec  - SLMOD) 

(Spec  - OLDUQ 
( Body 


OLDUC  Subunit*  OLDUC  - 8) 


OLDUQ 

SLMOD.  Subunits  SLMOD  - 2.  OLDUC  - 4) 
SLMOD.  Subunit*  SLMOD  - 2) 

OLDUC  Subutna  SLMOD  • I) 


OLDUQ 


Legend:  OLDUC  is  unchanged 

SLMOD  b slightly  modified 
NEW  b developed  from  scutch 


Figure  4.  Reuse  on  Compilation  Order 


CONCLUSIONS 

Overall,  the  study  revealed  that  the  pattern  of  reuse  has  evolved  from  initial  reuse  of 
utility  components  into  reuse  of  generalized  application  architectures.  Utility  compo- 
nents were  both  domain-independent  utilities,  such  as  queues  and  stacks,  and 
domain-specific  utilities,  such  as  those  that  implement  spacecraft  orbit  and  attitude 
mathematical  functions  and  physics  or  astronomical  models.  The  level  of  reuse  was 
significantly  increased  with  the  development  of  a generalized  telemetry  simulator 
architecture. 

The  use  o£Ada  generics  significantly  increased  the  level  of  verbatim  reuse,  which  is 
due  to  the  ability,  using  Ada  generics,  to  parameterize  the  aspects  of  design  that  are 
configurable  during  reuse.  A key  factor  in  implementing  generalized  architectures 
was  the  ability  to  use  generic  subprogram  parameters  to  tailor  parts  of  the  algorithm 
embedded  within  the  architecture. 

The  use  of  object-oriented  design  (in  which  objects  model  real-world  entities)  signifi- 
cantly improved  the  modularity  for  reuse.  Encapsulating  into  packages  the  data  and 
operations  associated  with  common  real-world  entities  creates  natural  building  blocks 
for  reuse. 
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■ Background 

■ Representations  of  reuse 
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Definition  of  Reuse 


■ Ada  source  code  obtained  from  existing  projects 
or  libraries 

■ Each  source  file  (a.k.a.  component)  classified 
according  to  percent  of  lines  reused  without 
change 

■ Definition  does  not  include  other  forms  of  reuse 
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Potential  Benefits  of  Reuse 


With  high-level  reuse,  delivered  systems  can  be 

■ Delivered  sooner  and  at  lower  cost 

■ Incrementally  improved 

■ More  reliable 
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Flight  Dynamics  Environment 


■ Develop  similar  systems  for  different  satellites 

■ Knowledge  carried  between  missions 

■ Reuse  an  important  part  of  culture 

■ Economic  benefits  directly  related  to  amount  of 
code  reused  without  change 

■ Introduction  of  Ada  and  OOD  significantly 
increased  reuse  of  code 
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5 Projects  Using  FORTRAN 


5 Projects  Using  Ada  and  OOD 


■■  TOTAL  REUSE 
I I REUSE  UNCHANGED 
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Steps  in  Process  improvement 


1 . Understand  and  characterize  current 
environment 

2.  Try  candidate  improvement 

3.  Measure  change  - feedback  experience 

4.  Adopt  candidates  with  favorable  results; 
Reject  candidates  with  unfavorable  results 
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Goals  of  Current  Phase  of  Study 


■ Understand  and  characterize  reuse 

- Determine  patterns  and  trends  of  reuse 

- Determine  characteristics  distinguishing 
reused  from  non-reused  components 

■ Identify  candidates  for  reuse  library 

■ Identify  domain  of  component's  reusability  within 
the  environment 

■ Address  some  CM  issues  related  to  reuse 
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Questions  Addressed  by  Study 


Does  separation  of  Interface  specification  and 
implementation  affect  degree  of  change  required? 

Do  Ada  generics  improve  the  level  of  reuse 
without  change? 

Does  the  extent  of  intercomponent  dependencies 
affect  reuse? 

What  is  the  granularity  of  reuse? 

Where  in  software  architecture  does  reuse  occur? 

Do  patterns  of  component  evolution  suggest 
guidelines  for  more  effective  reuse? 
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Ada  Projects  in  the  Flight  Dynamics  Division 
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Data  Used 


■ Size  of  Study 

- 9 Ada  projects,  over  5 years 

- Over  1900  reused  components 

■ Input  Used 

- SEL  Component  Origination  Forms 

- Source  code  files 

- Representations  of  software  design 
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Representations  of  Reuse 


1.  Multi-Project  Reuse  Summary 

2.  Project  Level  Reuse  Network 

3.  Component  Lineage  Reports 

4.  Reuse  on  Software  Static  Structure 

5.  Reuse  on  Software  Call  Tree 

6.  Reuse  on  Ada  Compilation  Order 
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1.  Multi-Project  Reuse  Summary  Report 

Producer  Projects ... 

Total 

Reusable 

Total 

Components 

Consumer 

Projects 

• 

1 • 

■ Represented  as  a matrix  or  spreadsheet 

■ Identifies  producer  and  consumer  projects 

■ Identifies  number  of  components  reused 

■ Identifies  degree  of  change  required 

Total 

Reused 

CSC  s 

1 1 
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2.  Project  Level  Reuse  Network 


J*  Computer  Sciences  Corporation 
System  Sciences  Division 


01510(12) 


4 Is 


4 


* 


* 

i 


3T 


3.  Component  Lineage  Reports 


■ Represents  reuse  over  time 

- From  originating  project  forward 

- From  reusing  project  back  to  origin 

■ Identifies  parent  - child  relationship 

■ Identifies  components: 

- Implementing  general  solution 

- Generalized  incorrectly  or  incompletely 

- Domain  of  applicability 

■ Useful  for  CM  purposes 
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4.  Reuse  on  Software  Static  Structure 


■ Represents  reuse  at  project  level 

■ Reflects  developer's  logical  view 

■ Makes  visible: 

- Granularity  of  reuse 

- Project  dependent  parts 
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5.  Reuse  on  Software  Call  Tree 


■ Represents  reuse  at  project  level 

■ Reflects  actual  calling  hierarchy 

■ Makes  visible: 

- Level  of  functionality  reused 

- Location  of  reuse  within  architecture 
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6.  Reuse  on  Ada  Compilation  Order 


■ Represents  reuse  at  project  level 

■ Reflects  coupling  between  "Library  Units" 

■ Makes  visible: 

- Scope  of  change  required  to  reuse 

- Location  of  reuse  within  architecture 
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Reuse  Patterns  and  Trends  Observed 


■ Initially  application  independent  components 
reused,  now  majority  reflect  organization's 
problem  domain 

■ Ada  generics  significantly  increased  the  level  of 
verbatim  reuse 

■ OOD  (where  objects  model  real  world  entities) 
significantly  improved  modularity 
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Work  Remaining 


■ Develop  guidance  for  improving  verbatim  reuse 

■ Investigate  rationale  behind  characteristics  that 
distinguish  reusable  components 

■ Confirm  hypothesis  - Achieved  highly  reusable 
solution  for  Telemetry  Simulator  applications 
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REUSE  METRICS  AND  MEASUREMENT  - A FRAMEWORK  • 

Donald  J.  Reifer,  President 
Reifer  Consultants,  Inc. 

Torrance,  CA  90505 

Abstract:  This  presentation  will  describe  the  lessons  learned  and  experience 

gleaned  by  those  finis  which  have  started  to  Inplement  the  reuse  metrics  and 
measurement  framework  prepared  by  the  Joint  Integrated  Avionics  Working  Group 
(JIAWG)  for  use  In  controlling  the  development  of  common  avionics  and  software 
for  Its  affiliated  aircraft  programs  (e.g.,  the  Air  Force’s  Advanced  Tactical 
Fighter  (ATF),  the  Army’s  LH  helicopter  and  the  Navy’s  A- 12  fighter).  The 
framework  was  developed  to  permit  the  JIAWG  and  Service  System  Program  Offices 
(SPOs)  to  measure  the  long-term  cost/benefits  resulting  from  the  creation  and 
use  of  Reusable  Software  Objects  (RSOs).  The  framework  also  monitors  the 
efficiency  and  effectiveness  of  the  JIAWG’ s Software  Reuse  Library  (SRL). 

The  presentation  will  begin  by  defining  the  metrics  and  measurement  framework 
which  was  established  to  allow  the  following  six  determinations  and  findings 
to  be  made  relative  to  software  reuse: 

1.  Impact  of  RSO  creation  on  software  cost  and  productivity. 

2.  Impact  of  RSO  reuse  on  software  cost  and  productivity. 

3.  Impact  of  RSO  mining  on  software  cost  and  productivity. 

4.  Minimum  standards  of  quality  for  RSOs  as  they  enter  the  SRL. 

5.  Efficiency  and  effectiveness  of  SRL  usage. 

6.  Long-term  cost/benefits  of  SRL  usage. 

The  presentation  will  discuss  how  the  following  seven  criteria  were  used  to 
guide  the  establishment  of  the  proposed  reuse  framework: 

1.  Compatible  - The  framework  should  be  compatible  with  the  software 
processes  used  by  JIAWG  contractors  to  develop  avionics  software 
products  in  Ada  under  0oD-STD-2167A. 

2.  Ease  of  Data  Collection  - The  data  needed  to  quantify  the  metric 
should  be  easy  to  collect  and  normalize. 

3.  Ease  of  Understanding  - The  metrics  employed  should  be  easy  to 
understand,  analyze  and  interpret. 

4.  Minimum  Cost  - The  measurement  costs  (i.e.,  data  collection, 
analysis  and  reporting)  should  be  kept  to  a minimum. 

5.  Nonobtrusi ve  - Collection  of  metrics  data  must  not  adversely 
impact  the  processes  or  products  being  measured. 

6.  Objective  - It  should  be  difficult  to  bias  or  distort  the  value  of 
the  metric. 

7.  Predictive  - The  metric  should  facilitate  generation  of  accurate 
estimates  of  software  cost,  productivity  and  quality. 
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Next,  object  recapture  and  creation  metrics  will  be  explained  along  with  their 
normal ized  use  in  effort,  productivity  and  quality  determination.  A single 
and  multiple  reuse  instance  version  of  the  popular  C0C0M0  cost  model  will  be 
presented  which  employs  these  netrics  and  the  measurement  scheme  proposed  by 
the  Software  Productivity  Consortium  (SPC)  to  predict  the  software  effort  and 
duration  under  various  reuse  assumptions.  Investigations  in  using  this  model 
to  predict  actuals  taken  from  the  RCI  database  of  over  one  thousand  completed 
projects  will  be  discussed  along  with  statistical  findings. 

User  experience  with  this  metrics  and  measurement  framework  as  part  of  the  Air 
Force’s  Reusable  Ada  Avionics  Software  Package  (RAASP)  and  Avionics  Fault- 
Tolerant  Software/Ada  Technology  Insertion  Program  (AFTS/ATIP)  projects  will 
be  discussed  next.  The  lessons  learned  with  these  metrics  by  these  projects 
will  be  siaaarlzed.  These  two  projects  are  conducting  controlled  experiments 
to  capture  measurement  data  that  provides  Insight  into  those  factors  which 
Impact  software  cost,  quality,  productivity  and  system  srformance.  The  RAASP 
effort  Is  focusing  on  determining  the  relative  impact  of  object-oriented 
methods,  reuse  paradigms  and  SRI  operational  policies  software  productivity, 
cost  and  quality.  AFTS/ATIP  Is  assessing  the  Impact  of  a large  number  of 
process  and  product  factors  on  overall  cost  and  system  performance. 

The  presentation  will  conclude  with  a summary  of  key  points.  Recommendations 
will  be  presented  to  help  those  embarking  on  a reuse  program  to  improve  their 
measurement  and  prediction  capabilities. 
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PURPOSE 
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Describe  the  reuse  metrics  and  measurement  framework 
created  by  J1AWG  to  make  the  following  determinations: 

• Impact  of  RSO  acquisition  on  software  cost  and 
productivity 

• Impact  of  RSO  reuse  on  software  cost  and 
productivity 

• Minimum  standards  of  quality  for  RSOs  entering  the 
Software  Reuse  Library  (SRL) 

• Efficiency  and  effectiveness  of  SRL  usage 
• • Long-term  cost/benefits  of  SRL  usage 

Discuss  implementation  of  the  framework  on  the  OSS  and 
RAASP  projects 


b 


Reusable  Software  Object  fRSOt  - life  cycle  products  developed 
to  be  reused  (designs,  algorithms,  code,  tests/test  cases,  etc.) 


Al  mtfcriafc  oowW  by  RCI  Nut  lot*  reproduced  without  prior  wrftlen  conwnl. 
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PRODUCTIVITY  IMPROVEMENT  STRATEGIES 
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* Productivity  must  be  measured  from  a quality  viewpoint 
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BARRIERS  TO  REUSE 


RCI-TN-470 


• Lack  of  incentives 

• Few  standards 

• Limited  tool  support 

• Champion  needed 

• Multiple  quality  levels 

• NIH  bias 

• Needed  infrastructure  changes 

• Few  quantitative  metrics 

? 

Source:  RCI  Reuse  Survey,  8/89 


Al matoftato  copyright  by  RCI.  Nol  to b*  ftpiodootd  without  prior  tpwlunconMot. 


7 of  If 


Page  4 


METRICS  SELECTION  CRITERIA 
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• Compatible  with  DOD  processes 

• Ease  of  data  collection 

• Ease  of  understanding 

• Minimum  measurement  cost 

• Objective  and  unbiased 

• Predictive  of  the  future 

• Unobtrusive  as  possible 
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OBJECT  ACQUISITION  RATIO 
n 

OAR  = 2 (Wj)  (aj/Aj) 
i=1 

where:  aj  = no.  of  RSOs  acquired 
per  collection 

Aj  ?=  no.  of  objects  in  that 
collection 

n ® no.  of  collections 

wj  = weighting  factor  for 
each  collection 

n 

and  2 wi  = 1*.  aj/Aj  > 0 
1=1 

Al mAUrbk oopyri ght by RCI.  Nd  lob* leprodoood wkhuA prior wilKn conwol. 
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OBJECT  REUSE  RATIO 


n 

ORR  = 2 (Wj)  (rj/Rj) 
i=1  • 

where:  q = no.  of  reused  objects 
in  a collection 

Rj  = no.  of  objects  in  that 


D.  Retfer 
RCI 
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REUSE  MECHANIZATION 
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Collection 

Wj 

• Requirements 

0.20 

• Design 

0.30 

• Source  code 

0.20 

• Tests/test  cases 

0.30 

• OARe  ■ (0.2)(a-jn  + a-jr)/A^  + (0.3)(a2n  + &2p  + a2r)^2  + 

(0.2)(a3n  + a3p  + a3r)/A3  + (0.3)(a4n  + a4p  + a4r)/A4 

where:  axn  = newly  created  objects 
aXp  «=»  purchased  objects 
axr  = recovered  objects 

• ORR0  - (0.2)  r,/Rj  + (0.3)  r^  + (°-2)  r3/R3  + (°-3)  r4/R4 

where:  rx/Rx  = reuse  ratio  for  a collection 


Al maiwWi oojjyngM by RGI.  Nol lobe lepiodoned wihoul ptiof  wiilen tonaiil, 
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METRICS  USAGE 
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REUSE  VERSION  COCOMO  (SINGLE  INSTANCE) 


I 


Effort,.  » c (I  + (OARx)(b18)  - (ORR0)(b19))  Effort 

where:  c = adjustment  factor  for  domain 

b18  = RSO  cost  factor  (0.10  < b^8  < 0.36) 
big  = RSO  benefits  factor  (0.20  < big  < 0.60) 

OARx  = expanded  form  of  OAR 
ORRe  » effective  form  of  ORR 
Effort,.  » cost  in  staff-months  with  reuse 
Effort  - cost  in  staff-months  (COCOMO) 

OARx  = (0.2)(a,n  + (0.5)  a,^  + (0.3)(a2n  + (0.2)  a2p  + 

(0.4)  a2r)/A 2 + (0.2)(a3n  + (0.2)  a2p  + (0.5)  ag^/Ag  + 
(0.3)(B4n  + (0.3)  a^p  + (0.8)  Q4f)/A4 
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FACTOR  RATINGS 
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LOW 

NOMINAL 

HIGH 

VERY  HIGH 

EXTRA  HIGH 

Reuse 

Cost 

Factor 

<bi8) 

0.10 

0.17 

0.26 

0.31 

0.36 

Limited 

reuse 

packaging 

Design  and 
code  RSO 
reuse 
packaging 

Full  RSO 
reuse 
packaging 

Domain 
specific  RSO 
reuse 
packaging 

Extensive 

reuse 

packaging 

(synthesis) 

Reuse 
' Benefits 
E » Factor 

* (Dig) 

I 


LOW 

NOMINAL 

HIGH 

VERY  HIGH 

EXTRA  HIGH 

0.20 

0.25 

0.34 

0.48 

0.60 

Planned 

reuse 

Systematic 

reuse 

Managed 

reuse 

Institutionalized 

reuse 

(within  and 
across  jobs) 

Optimized 

reuse 

(domain 

specific) 

Al material! oopyrigM by  RCI.  Hollo U raprodooad without prk* wrlten content, 
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METRICS  USAGE 

RCI-TN-470 


SPC  MODEL 


Cost  = (1-R)  + R(B  + E/N) 

where:  B - relative  cost  to  reuse  RSO 
R o proportion  of  reused  software 
E » cost  to  develop  RSO 
N - number  of  reuses 


REUSE  COCOMO  MODEL  (MULTIPLE  INSTANCES) 


Effortm  = c(1  , (OARx)  (b1fi)  - (ORR0)  (b19))  Effort 

m 

where:  c = calibration  coefficient 

m = number  of  reuses  (m  > 1) 
b^  = cost  factor  (0.10  < b-jg  < 0.36) 

b^  = benefit  factor  (0.20  < b-jg  < 0.60) 

OARx  - Object  Acquisition  Ratio  (average) 

ORRe  - Object  Reuse  Ratio  (average) 

AX  maUoate  copyright  by  RCI.  Not  to  be  reptotk*^  wKhoul  prior  wt  ton  consent. 


«i*>n  *»«j 


THE  OSS  EXAMPLE 


Page  10 


RCI-TN-470 


Al matiMiili copyr iflhl by RCI.  Nodob* i*ptoduo*<l wilhou prk* will*n consent. 
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QUALITY  METRICS 
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FACTOR 


CRITERIA 


• Correctness 

• Efficiency 

• Maintainability 

• Portability 

• Testability 

• Usability 


>•  Clarity 

Coupling  Strength 
Independence  - 
■>  Modularity 
>-  Self-descriptiveness 
>•  Simplicity 


METRICS 


iojndep 


Y t 

task-indep  machjndep 


I 


I 


softjndep  physjimjndep 


I softjndep  = no_sys_dep_mod  + no_impl_dep_pragmas  I 
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LIBRARY  EFFICIENCY 
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FACTOR 

CRITERIA 

METRICS 

• Efficiency 

• Effectiveness 


■>»  Average  service  time 
System  response  time 
>•  System  throughput 
>»  Resource  utilization 
>>  Workload  characteristics 


w P 

°l 


y 


Cumulative  number 
of  times  SRL  browsed 


Cumulative  number 
of  times  RSO  retrieved 


Cumulative  number 
of  times  SRL  searched 


Al imMtWi  copyright  by  RCI:  Notlob*  rrpnxHtcid  wdhotn  prior  wrilon  conwril; 


turn  **4 


Page  13 


FACTOR  | 

• Efficiency 

• Effectiveness 


££  Number  of 

users/month 


Al  meterUb  oopyttoM  by  RCI.  Not  lo  be  reproduced  wKhoti  prior  mlten 
* 


IDS 

•*»#»■ -a 


$ 


4 


* 


UH  f Ij 


LONG-TERM  COST/BENEFITS 


NET  PRESENT  WORTH 


T 

NPW  = £ (Bt)  (1/(1  +i)>) 

t=0 


RCI-TN-470 


Acquisition  $ 

Adaptation 

Documentation 

Infrastructure 

Training 

COSTS  $ 


Administration  $ 

Maintenance 

Operations 

COSTS  $ 


• Cost  avoidance 

$ 

• Added  capability 

• Reduced  cost 

of  quality 

• Cost  savings 

BENEFITS 

$ 

I 'll  II  m ■ 

• Better  customer 

satisfaction 

$ 

• Fitness  for  use 

BENEFITS 

$ 

M iiwMMi  by  RGI,  Net  19  b*  WOduGKl  wktou  pilot  mm  teottnl. 
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RAASP  ADAPTATION 
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HYPERTEXT  LIBRARY  EFFICIENCY 
AND  EFFECTIVENESS 


• No.  of  objects  In  library 

• No.  of  links  traversed/hit 

• No.  of  items  browsed/hit 

• Amount  of  time  for  a hit 

• No.  of  log  in’s  per  user 

• Amount  of  time/user  session 

• No.  of  objects  withdrawn/user  session 

• No.  librarian  actions/object 

• No.  of  objects  submifted/month 

• No.  of  objects  withdrawn/month 

• No.  of  SPRs/object/month 

• No.  of  SCRs/object/month 


Usage  Profiles 

• By  object 

• By  service 

• System-wide 


Al  nriittil^f  oupyilgU  by  HUf.  NuOa  t*  uptuJimJ  wJlftuut  wt l*fi  cut  t«ml. 
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SUMMARY  AND  CONCLUSIONS 
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• We've  described  the  JIAWG  software  reuse  metrics  and 
measurement  framework 

• We've  described  the  pilot  implementation  of  the  framework 
on  OSS  and  RAASP 

• We’ve  discussed  our  multiple  instance  reuse  version  of  the 
COCOMO  model 

• Needed  to  explore  the  economics  of  reuse 

• We've  just  touched  the  surface  of  the  issues  involved 

• Your  thoughts,  feedback  and  help  are  solicited  especially  if 
you  have  "hard"  data  to  share 
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Cost  and  Quality  Planning  for  Largo  NASA  Programs 


The  Software  Cost  and  Quality  Engineering  methodology  developed  over  the  last  two  decades  at  IBM 
Federal  Sector  Division  (FSD)  in  Houston  is  used  to  plan  the  NASA  Space  Station  Data  Managenrst 
System  (DMS).  An  ongoing  project  to  capture  this  methodology,  which  is  built  on  a foundation  . T 
experiences  and  “lessons  learned/’  has  resulted  in  the  development  of  a PC-based  tool  that  integrates  cos? 
and  quality  forecasting  methodologies  and  data  in  a consistent  manner.  This  tool  Software  Cost  and 
Quality  Engineering  Starter  Set  (SCQESS),  is  being  employed  to  assist  in  the  DMS  costing  exercises.  At  the 
same  time,  DMS  planning  serves  as  a forcing  function  and  provides  a platform  for  the  continuing,  iterative 
development,  calibration,  and  validation  and  verification  of  SCQESS.  The  dan  that  forms  #>e  cost  and 
quality  engineering  database  is  derived  from  more  than  17  years  of  development  of  NASA  Space  Shuttle 
software,  ranging  from  low  criticality,  low  complexity  support  tools  to  highly  complex  and  highly  critical 
onboard  software. 


INTRODUCTION 

Software  cost  and  quality  engineering  is  the  systematic  approach  to  the  estimation,  measurement,  and 
control  of  software  costs  and  quality  on  a project.  This  discipline  provides  the  vital  fink  between  the 
concepts  of  economic  analysis  and  the  methodology  of  software  engineering.  The  tasks  involved  in  software 
cost  and  quality  engineering  are  complex,  and  individuals  with  the  knowledge  and  skill  required  are  scarce. 
The  accuracy  and  consistency  of  the  results  are  often  questionable.  There  is  a definite  need  for  tools  to 
enable  analysis  by  managers  and  planners  who  are  not  experts  and  to  improve  the  results. 

There  are  many  instances  in  planning  software  development  activities  when  a quick  and  easy  to  use  cost  and 
quality  estimating  tool  would  be  of  value.  These  include  management  consulting,  proposal  generation,  and 
analysis  of  existing  programs  for  problem  correction  or.  assurance.  There  is  little  lime  to  learn  a complex 
tool  or  to  digest  introductory  user  information.  SCQESS  requires  no  set-up,  has  a selectable  demonstration, 
direct  access  to  the  tool  functions,  and  contains  examples  of  vaned  applications. 

SCQESS  resides  on  a tool  sharing  disk  which  is  available  to  all  company  sites.  The  user  has  the  option  of 
viewing  a demonstration  program  to  illustrate  the  cost  and  quality  estimation  process.  SCQESS  includes  a 
Lotus- based  cost  and  quality  estimation  spreadsheet.  The  spreadsheet  inrludrs  cost  and  quality  models 
which  are  based  on  historical  data  and  various  criticality  levels.  If  different  models  are  required,  the  existing 
mode!*;  can  be  easily  modified.  An  example  Lotus  spreadsheet  illustrates  a completed  estimation-  This 
spreadsheet  can  be  modified  or  a blank  file  is  supplied  if  the  work  is  entirely  new.  The  tool  can  handle 
estimates  involving  many  languages  including  Ada  and  can  also  be  used  for  estimating  reused  elrments. 

Once  the  basic  cost  and  quality  estimates  are  completed,  the  user  executes  a Rayleigh  Curve  program  to 
phase  the  estimates  over  time.  A Rayleigh  curve  is  a plot  of  a mathematical  function  which  describes  Hfc 
cycle  phenomena.  A Rayleigh  curve  indicates  whether  the  slope  of  the  staffing  curve  is  too  steep  or  whether 
the  error  density  of  the  project  is  too  great  at  certain  points  in  the  process. 

The  cost  and  quality  estimate  can  then  be  quickly  modified  to  determine  variance  of  results  based  on 
changes  in  assumptions.  The  cost  and  quality  estimates  support  the  strength  of  the  plan  being  generated. 
The  estimates  can  be  used  to  analyze  risk  and  mitigation  plans.  Examples  of  actual  project  cost  and  quality 
reports  are  available  to  the  user.  These  examples  include  a project  involving  Ada.  reuse  and  commercial 
elements;  a project  involving  translation  of  code  from  one  machine  architecture  to  another  and  a transaction 
oriented  commercial  system.  The  spreadsheets  and  the  Rayleigh  curves  are  included  in  the  report. 
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Software  Life  Cycle  Costing  and  Quality  Estimation  Methodology 

Past  experience  in  managing  large  software  projects  such  as  Space  Shuttle  Onboard  Software  illustrates  that 
accurate  cost  and  quality  estimates  based  on  reliable  historical  data  are  essential  to  software  planning.  PSD 
Houston  has  collected  extensive  data  from  NASA  and  other  software  development  projects  over  the  last  IT 
years.  This  data  includes  source  lines  of  code,  productivity,  error  rates,  computer  usage,  esc.  for  mom  than 
250  projects.  Historical  data  supports  initial  estimation  of  project  size  and  estimation  of  effort  from  that 
size.  The  standard  Rayleigh  curve  model  converts  estimated  labor  to  a schedule  and  staffing  profile  ~ the 
elements  of  a cost  plan.  It  also  projects  error  estimates  across  the  schedule  to  generate  a quality  plan. 

The  widely  used  Rayleigh  curve  models  the  typical  build-up  of  staff  and  errors  during  the  requirements  and 
design  phases,  the  peak  for  implementation,  and  the  tail-off  during  the  testing  phase.  The  staffing  schedule 
for  an  ideal  project  approximates  a Rayleigh  curve.  During  sustaining  engineering,  a minimum  le^el  ci 
critical  dn!U  is  required  for  effective  maintenance.  This  steady-state  staffing  level  forms  the  support  lime,  h 
includes  critical  skills  for  requirements,  design,  implementation,  testing,  and  management.  The  support  lire 
is  a function  of  system  size  and  productivity  as  well  as  unique  skill  requirements  specific  to  the  ‘oerwaze 
being  maintained. 

The  areas  below  the  support  line  and  above  the  maintenance  tail  of  the  Rayleigh  curve  is  available  foe  near 
development  work.  Sustaining  engineering  operational  increment  also  corresponds  to  a Raykigh  curve. 
Each  smudning  engineering  effort  can  be  modeled  as  the  sum  of  a sequence  of  such  curves.  The  sizing  and 
scheduling  of  new  development  activities  should  be  planned  to  provide  a stable  level  of  effort.  Sacnrae 
maintenance  which  handles  Discrepancy  Reports  can  continue  at  a lower  support  level. 

Historical  project  data  supports  software  size,  labor,  and  quality  estimation.  The  Locus-based  Matrix 
Method  function  distributes  the  effort  and  errors  over  organizational  elements.  The  Rayleigh  Curve  fumcticc 
generates  a staffing  and  error  discovery  profile  over  time.  Special  models  are  available  in  other  packages 
adjust  estimates  involving  expert  systems,  reusable  software,  reconfiguration,  quality  tracking,  zzd 
maintenance. 


Planning  the  Space  Station  DMS  Utilising  SCQESS 

SCQESS  has  been  used  to  assist  in  the  costing  of  the  Space  Station  Data  Management  System  (D\1S),  x 
complex  software  system  involving  a distributed  environment  with  multiple  languages  and  applications.  The 
DMS  for  Space  Station  is  also  affected  by  the  requirements  for  long  lifetime,  permanent  operations,  remote 
integration,  and  phased  technology  insertion  of  productivity  tools,  applications,  expert  systems,  etc.  Major 
cost  and  quality  drivers  include  the  large  size  and  diversity  of  the  software,  complexity,  development  stopped 
environment,  off-the-shelf  and  reusable  software,  and  criticality,  which  varies  from  one  module  to  another 
An  example  of  the  type  of  results  — at  the  end  of  the  intermediate  step  of  development  cost  and  onaiirr 
estimation  — obtained  with  SCQESS  for  the  DMS  planning  is  included  in  the  presentation. 


Summary/Condusions 

The  software  cost  and  quality  engineering  methodology  employed  at  IBM  FSD  Houston  has  been  capture: 
and  integrated  into  a prototype  tool,  SCQESS.  This  PC-based  tool  integrates  cost  and  quality  plamnnx 
methodology  and  data 

SCQESS  has  been  employed  to  assist  in  the  cost  and  quality  planning  of  the  Space  Station  DMS  (Dan. 
Management  System).  It  is  providing  a standardized  approach  for  the  DMS  planning,  which  involves  severa 
individuals.  It  has  made  the  process  more  efficient  and  has  allowed  a consistent  approach  to  planning-  Tie 
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automation  and  captured  methodology  has  established  the  foundation  and  mechanism  enabling  the 
continuing  calibration  and  improvement  in  accuracy  and  consistency  for  Space  Station  DMS  costing. 
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MODEL  RECONCILIATION 


ALT  (EARLY  MODEL) 

PROJECTED  « 10429  MM 

DEVELOPMENT  PART  OF  ALT  = 9403  MM  ACTUALS 
ERROR  IS  1026  MM  OR  11%  HIGH 

STS-1  (MIDDLE  MODEL) 
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ERROR  IS  616  MM  OR  6%  LOW 

STS-2  THRU  STS -5  (MIDDLE  MODEL) 

PROJECTED  > 5864  MM 

DEVELOPMENT  PART  OF  STS-2  THRU  STS-5  = 5994  MM  ACTUALS 
ERROR  IS  130  MM  OR  2%  LOW 


TOTAL 
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PROJECTION  COMPARISON 
(INCLUDING  PROCESS  DATA) 
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Abstract 

In  this  paper  we  present  the  results  of  an  experimental  study  under- 
taken to  assess  the  improvement  in  program  quality  by  using  formal  spec- 
ifications. Specifications  in  the  Z notation  were  developed  for  a simple 
bat  realistic  anti-missile  system.  These  specifications  were  then  used  to 
develop  2 versions  in  C by  2 programmers.  Another  set  of  3 versions  in 
Ada  were  independently  developed  from  informal  specifications  in  Engfisk. 
A comparison  of  the  reliability  and  complexity  of  the  resulting  programs 
suggests  the  advantages  of  using  formal  specifications  in  terms  of  number 
of  errors  detected  and  fault  avoidance. 


* Finding  during  the  course  of  this  project  was  provided  via  US  Army  contract  no.  DAKF 
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EXTENDED  ABSTRACT 


Specification  languages  are  widely  accepted  as  a stepping  stone  for  design 
and  development  of  a complex  software  system  [1, 2, 3t  4, 5].  The  advantages  of  a 
specification  language  are  often  not  immediately  clear  in  terms  of  program  qual- 
ity and  reliability.  Proving  an  executable  program  correct  for  complex  systems 
is  computationally  an  intractable  task  [6].  Also  "an  effective  testing  strategy 
which  is  reliable  for  ail  programs  cannot  be  .onstructed”  [7].  In  such  a setting, 
formal  specification  languages  coupled  with  structured  design  methodologies  [8] 
provide  a streamlined  approach  for  software  design  and  development. 

In  this  experimental  investigation,  we  study  the  effect  of  the  specification 
language  Z [11]  on  program  reliability  and  complexity.  For  our  experiment  we 
chose  the  NASA  Laundi  Interceptor  Problem(LIP)  since  it  has  been  used  ex- 
tensively for  several  other  studies  in  software  reliability  and  fault  tolerance.  It 
is  a simple  but  realistic  representation  of  an  anti-missile  system.  The  original 
specifications  were  taken  from  Knight  and  Leveson  [12].  The  LIP  is  a constraint 
satisfaction  problem,  a solution  to  which  is  a decision  procedure  which  takes  a 
set  of  input  points  and  launch  characteristics  to  evaluate  a set  of  initial  launch 
conditions,  called  the  preliminary  unlocking  matrix.  The  procedure  then  eval- 
uates a logical  combination(the  combination  is  decided  by  an  input  matrix)  of 
the  initial  conditions  called  the  final  unlocking  vector  the  components  of  which 
collectively  decide  if  the  launch  signal  should  be  true  or  otherwise. 

The  experiment  consisted  of  usual  phases  of  software  design  and  develop- 
ment with  minor  differences.  In  the  specification  phase  a set  of  specifica:xons  of 
the  requirements  was  developed  in  the  Z notation  and  was  validated  by  other 
specifiers.  Several  versions  were  developed  based  pn  informal  and  formal  require- 
ments specifications  separately,  by  independent  groups  of  programmers.  For 
testing,  a hybrid  approach  [13]  was  developed  based  on  functional  and  struc- 
tural information  about  the  LIP.  For  generating  test  cases,  the  hypothetical 
launch  conditions  were  divided  into  7 relatively  independent  groups.  The  truth 
values  of  one  of  the  groups  was  fixed  a priori,  and*  an  input  data  set  was  con- 
structed to  satisfy  the  prefixed  truth  values  of  t'  a group  and  the  truth  values 
of  the  rest  wer^  computed  against  the  input  set.  Such  manually  designed  test 
cases  were  used  to  test  each  program.  A: * .r  debugging,  when  the  computations 
of  launch  conditions  for  all  the  versions  match,  the  cyclomatic  complexity  mea- 
sure [9]  is  applied  to  compute  internal  complexity  of  each  individual  module. 
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Also  computed  are  the  external  complexity  due  to  the  interconnections  between 
various  modules  based  on  ” information  flow”  concepts  (10],  and  finally  the  total 
system  complexity  as  a weighted  sum  of  internal  and  external  complexities. 

The  versions  based  on  inform  ad  requirements  are  found  to  be  afflicted  with 
usual  problems  caused  by  the  inherent  ambiguities  in  the  informal  requirements. 
However,  a significant  reduction  was  observed  in  the  number  of  errors  detected 
in  the  testing  phase  in  case  of  the  versions  based  on  formal  requirements.  Fur- 
ther, complexity  measures  strongly  suggest  that  versions  based  on  formal  spec- 
ifications are  less  complex  and  more  reliable  than  these  based  on  informal  ro- 
quirements.  The  study  also  suggests  that  the  formal  specifications  developed 
through  several  successive  stages  of  operations  refinement  lend  themselves  to  an 
automatic  modular  program  development(special  case  of  a divide  and  conquer 
technique)  in  an  optimal  way,  and  thus  reduce  the  error-proneness  of  the  pro- 
gram and  make  it  more  reliable. 


Summary  of  Experimental  Results 
L Productivity: 

Table  1 - specification  development  time 

Version  number  | Total  Specification  Development  Time(hours) 
Spec  I 47 

Table  2 - program  development  time 

Version  number  Total  Program  Development  Time( hours) 
Cver  I 18 

Cver  II  38 

Adaver  I 76 

Adaver  II  73 

Adaver  III  89 

IL  Reliability  (in  terms  of  number  of  errors  detected) 


Table  3 - Number  of  errors  detected  during  development 


Version  number 
Cwl 
Cver  II 
Adaver  I 
Adaver  II 
Adaver  III 


Total  Number  of  Errors 

3 
8 
8 
7 

4 
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Tablt  4 - Ntim6cr  of  errors  detected  during  testing 


Version  number 

Total  Number  of  Errors 

Cver  I 

0 

Cver  II 

7 

Adaver  I 

13 

Adaver  II 

11 

Adaver  III 

8 
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OBJECTIVES  OF  STUDY 


• Investigate  the  effect  of  using  formal  specifications  on 

• productivity 

- reliability 

- complexity 


• Compare  results  with  versions  developed  from  informal 
specifications 
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EXPERIMENTAL  APPRAOCH 
Used  NASA  - Launch  Interceptor  Problem  (LIP) 


Developed  Z-specifications  from  English  specifications  of 
LIP  (Two  independent  Z specifications) 


Used  Z-specs  to  develop  3 indipendent  versions  in  C 


Each  version  tested  for  a set  of  54  test  cases  from  a previous 
experiment  involving  LIP 


Each  version  executed  for  one  million  random  test  cases  to 
simulate  operational  testing 


LIP 


• Simple,  but  realistic  anti-missile  system. 


• Studied  elsewhere41  in  connection  with  fault-tolerant  and 
Fortran/Ada  comparison  software  research 


• Program  reads  inputs  which  represent  radar  reflections, 
checks  whether  some  prespecifled  conditions  are  met  and 
determines  if  the  reflections  come  from  an  object  that  is  a 
threat  and  if  yes,  signals  a launch  decision 


* Knight  and  Leveson,  IEEE-TSE,  January  1986. 
Goel,  etal,  COMPSAC  87  and  RADC-TR-88-213. 


A.  God 

Synam  L'arr. 

Pat*  U<*2S 


EXAMPLE 


Launch  Interceptor  Conditions 


LIC  1:  There  exists  at  least  one  set  of  two  consecutive  data 
points  that  are  a distance  greater  than  LENGTH  1 
apart 


0 < = LENGTH1 


LIC  11:  There  exists  at  least  one  set  of  three  data  points 
separated  by  exactly  E and  F consecutive 
intervening  points,  respectively,  that  are  the  vertices 
of  a triangle  with  area  greater  than  AREA1 

1<E 

1 < F 

E + F < NUMPOINTS  - 3 
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Z-SPECIFICATION  LANGUAGE 


Well  known  specification  language  developed  by 
Programming  Research  Group  at  Oxford  University 


Has  been  applied  to  develop  specifications  for  several 
software  systems  but  we  are  not  aware  of  experimental 
results  comparing  it  with  informal  approaches 


SOME  COMMENTS  ON  Z FOR  LDP 


Z-specifications  were  helpful  in  several  aspects. 

Some  Examples: 

• In  resolving  certain  ambiguous  issues 

• whether  two  identical  (x,  y)  pairs  can  belong  to  a sequence 
of  input  data  points 


• In  expressing  invariant  properties 

- the  LCM  matrix  is  symmetric,  can  be  easily  expressed 
mathematically 


• In  exploiting  the  repetitiveness  of  certain  launch  conditions 
which  was  helpful  in  functional  groupings  for  design  and 
testing. 

- a closer  look  at  LIC  1, 8 and  13  indicates  that  they  are 
related.  We  exploit  the  similarity  by  defining  a 
"prototype"  schema,  and  then  using  it  to  define  each  of 
these  separately 
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Expressing  Requirements  in  the  Z 

Notation 

Example:LICl 

Informal  Specification 

LIC  1:  There  exists  at  least  one  set  of  two  consecutive 
data  points  that  are  a distance  greater  than  LENGTH1  apart. 
(LENGTH1  >0) 

Formal  Specification 


Z/Cl(NI/MPO/NTS,Z£iVGrtfl| 

POINTS  :seq  RxR 

cmv,  cmv ' : N"  — > B 

emu'  = emu© 

{1  ~ (1  < #{(POINTS(i),POINTS(j))\Vi,j  : I..NUM POINTS • 
j = i + 1 A {LENGTH  1 < cdist{POI  NT  S{i),  POI  NT  S{j)))) 
ALENGTHl  > 0)} 

where  ed.ist{p,q ) computes  the  distance  between  points  p 
and  q. 


c=-& 
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Expressing  Requirements  in  the  Z 

Notation 

Example:LIC7 

Informal  Specification 

LIC  7:  There  exists  at  least  one  set  of  N_PTS  consecutive 
data  points  such  that  at  least  one  of  the  points  lies  a dis- 
tance greater  than  DIST  from  the  line  joining  the  first  and 
last  of  these  points.  If  the  first  and  last  points  of  these  N_PTS 
points  are  identical,  then  the  calculated  distance  to  compare 
with  DIST  will  be  the  distance  from  the  coincident  point  to  all 
other  points  of  the  N.PTS  consecutive  points.  (DIST  >0) 

Formal  Specification 


JjCTifnJWPOTNfS,  NpTS,  DIST]  " " 

POINTS : seq  Rx  R 

cmvycmv' : M-+8 

emu'  = emu© 

{7  ~ (1  < # { ( POINTS(i),  POINTS(j))\Vi,  j : I..NUM POINTS* 
j = i + N.FTS  — 1 A 3*  : : + l..j  - 1* 

(ptjcmp(POINTS(i),  POINTSU )) 

A {cdist{POINTS{i),POINTS(k))  > DIST)) 
V{^pt.cmp(POINTS(i),  POINTS{j)) 

A{pdist{POINTS{i),  POINTS(j),  POINTS(k))  > DIST))} 

AjD/sr  > o)} 

where  edistfjp , q)  computes  the  distance  between  points  p 
and  q , pdist(j> , q,  r)  computes  the  perpendicular  distance  from 
point  r to  the  line  through  p and  q and  ptjcmp(p,  q)  returns  a 
boolean  value  true  if  p and  q are  identical,  and  otherwise  false 
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Expressing  Requirements  in  the  Z 
Notation(contcL) 

Example:LIC7 


pdist : 'll2  x V}  x K7  -*  TV 

Vp, qtr€lV  • 

-,(pt.cmp(p,q))  =>  pdist(p,q)  = 

Note  that  the  line  must  be  well  defined,  i.e,  at  least  the  points 
on  the  line  must  not  be  identical.  Obviously  this  is  a partial 
function. 
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SOME  PROGRAM  METRICS 


Programmer 

C-Code  From 
Z-Specs 

Ada  Code  From 
Informal  Specs 

A 

B 

C 

D 

E 

F 

Source  Lines 

373 

407 

669 

691 

624 

851 

Comment  Lines 

82 

80 

59 

59 

126 

251 

System  Complexity* 

56 

53 

81 

334 

309 

297 

* See  Lew  et  al,  TSE,  November  1988. 
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COMPLEXITY  METRIC* 


• System  S has  n modules,  each  with  complexity  Mj 


• System  complexity  s VlMt 

• M|  depends  oo 

• Internal  complexity 

• External  complexity  (measures  module  interrelationships) 

• Internal  complexity 

• McCabe’s  cydomatic  number 


• External  complexity 

- Amount  of  interaction  with  the  environment 

- Depends  on  the  information  content  of  input  and  output 

parameters 


* Lew  et  al,  IEEE-TSE,  November  1988,  pp.  1645-1655. 
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PROGRAM  DEVELOPMENT  EFFORT  (hours) 


Versions 

Develop 

Z-Specs 

Design 

Coding 

Testing 

Total 

A 

27 

6 

6 

6 

45 

B 

10* 

10 

10 

8 

38 

c 

33 

8 

6 

4 

51 

* B used  specs,  developed  by  A 


Learning  Z:  A-20hrs. 

C - 21  hrs. 
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NUMBER  OF  ERRORS* 
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COMPARISON  OF  DATA  FROM  C AND  ADA  VERSIONS 


• We  compared  the  effort  and  error  data  from  a previous 
experiment  that  used  Fortran  and  Ada  languages. 


• We  do  not  think  that  our  results  are  biased  because 
language  dependent  aspects  are  not  under  study  here.  Also, 
the  programmers  in  these  studies  were  reasonably  proficient 
in  the  respective  languages  so  that  the  choice  of  the  language 
should  not  affect  our  results 


• However,  to  enhance  our  conclusions,  we  plan  to  develop  C 
versions  from  informal  specifications 
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COMPARISON  OF  EXPERIMENTAL  RESULTS:  EFFORT  AND  ERRORS 


r 

4 * X 

i 

1 


Programmer 

z 

Informal 

A 

B 

C 

D 

E 

F 

Effort 

45 

32 

51 

76 

73 

89 

D&UT 

3 

1 

3 

5 

4 

4 

Errors 

FT 

0 

7 

0 

8 

7 

4 

Total 

3 

8 

3 

13 

11 

8 

D&UT  - Development  and  Unit  Testing 
FT  - Function  Testing 


) 
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FAULT  AVOIDANCE  BY  USING  Z 


• We  believe  that  certain  types  of  faults  can  be  avoided  by 
using  formal  specifications 


• Following  are  two  explicit  examples  of  faults  avoided  by 
using  for  LIP* 

- Caluclation  of  angle  between  n and  2k  rather  than  between 
0 and  x 

- Calculation  of  distance  from  point  to  line  when  points  are 
collinear  and  first  point  not  between  other  two 


* See  Brilliant  et  al,  TSE,  February  1990,  page  242. 
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FAULT-AVOIDANCE  - EXAMPLE  LIC  7 
• Consider  3 collinear  points  (A,  B,  C)  as  shown 

9 

ABC 


• Need  to  compute  distance  from  B to  line  AC  (LIC  7) 


• Computation*  from  informal  specs  can  lead  to 
Dist(A,  C,  B)  = min  (dist(A,  B),  dist(B,C)) 


• However,  formal  specifications  always  compute  zero,  the  correct 
result 


See  Brilliant  et  al,  TSE,  February  1990,  p.  242. 
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CONCLUDING  REMARKS 


. ^ ► 


• Use  of  Z specifications  was  clearly  helpful  in  reducing  errors 
(and  hence  increasing  reliability) 


• Based  on  a few  metrics,  it  is  also  evident  that  the  complexity 
of  code  developed  from  Z was  also  lower 


• Total  effort  involved,  including  learning  Z and  development 
of  Formal  specifications,  was  comparable  to  the  effort 
involved  in  developing  versions  from  informal  specifications 


Yet  — 


• This  experiment  does  not  provide  conclusive  evidence  about 
the  superiority  of  formal  specification  over  informal  ones 


• Further  investigation  necessary  to  explore  the  feasibility  and 
usefulness  of  Z for  large  problems 


• Reusability  of  such  formal  specifications  also  needs  to  be 
investigated 
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ABSTRACT 

Software  inspection  is  a technical  evaluation  process  for  finding  and 
removing  defects  in  requirements,  design,  code  and  tests.  Software 
inspections  have  been  used  by  a wide  variety  of  organizations  for 
improving  software  quality  and  productivity  since  their  original 
introduction  at  IBM.  The  Jet  Propulsion  Laboratory  (JPL),  California 
Institute  of  Technology,  tailored  Fagan's  original  process  of  software 
inspections  to  conform  to  JPL  software  development  environment  in 
1987.  However,  the  fundamental  rules  of  the  Fagan  inspection 
process  were  adhered-to. 

Detailed  data  was  collected  during  the  first  three  years  of  experience 
at  JPL  on  203  inspections.  Statistics  are  discussed  for  this  set  of 
inspections.  Included,  on  a per  inspection  basis,  are  averages  of:  staff 
time  expended,  pages  covered,  major  defects  found,  minor  defects 
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found  and  inspection  team  size.  The  Inspection  team  size  varied  from 
three  to  eight  participants  with  the  JPL  Product  Assurance 
Organization  providing  most  of  the  moderators. 

Analysis  of  variance  (alpha  = 0.05)  showed  a significantly  higher 
density  of  defects  during  requirements  Inspections.  It  was  also 
observed,  that  the  defect  densities  found  decreased  exponentially  as 
the  work  products  approached  the  coding  phase. 

Increasing  the  pace  of  the  inspection  meeting  decreased  the  density 
of  defects  found.  This  relationship  was  observed  to  hold  for  both 
major  and  minor  defect  densities,  although  it  was  more  pronounced 
for  minor  defects. 

This  paper  provides  guidelines  for  conducting  successful  software 
Inspections  based  upon  three  years  of  JPL  experience.  Readers 
Interested  in  the  practical  and  research  implications  of  software 
Inspections  should  find  this  paper  helpful. 

INTRODUCTION 

This  paper  describes  an  analysis  of  factors  influencing  the  defect 
density  of  products  undergoing  software  inspections.  Software 
intensive  projects  at  the  Jet  Propulsion  Laboratory  (JPL)  require  a 
high  level  of  quality.  JPL.  a part  of  the  California  Institute  of 
Technology,  is  funded  by  NASA  to  conduct  its  unmanned 
Interplanetary  space  program.  Software  inspections  were  Introduced 
at  JPL  in  1987  to  Improve  software  quality  by  detecting  errors  as  early 
in  the  developmental  lifecycle  as  possible. 

Software  Inspections  are  detailed  technical  reviews  performed  on 
intermediate  engineering  products.  They  are  carried  out  by  a small 
group  of  peers  from  organizations  having  a vested  Interest  in  the  work 
product.  The  basic  process  is  highly  structural  and  consists  of  six 
consecutive  steps:  planning,  overview,  preparation.  Inspection 
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meeting,  rework  and  follow-up.  The  Inspection  process  Is  controlled 
and  monitored  through  metrics  and  checklists.  One  of  the  best 
fundamental  descriptions  of  this  process  is  Fagan's  original  article 
[Fagan.  1976). 

JPL  tailored  Fagan's  original  process  to  improve  the  quality  of  the 
following  technical  products  of  a software  Intensive  system:  Software 
Requirements,  Architectural  Design,  Detail  Design,  Source  Code.  Test 
Plans,  and  Test  Procedures.  For  each  of  these  types  of  products  a 
checklist  was  tailored  for  JPL's  application  domain,  standards  and 
software  development  environment.  Supplemental  tailoring  included 
the  addition  of  a "third  hour"  step  to  Fagan's  process.  The  "third  hour" 
step  was  first  discussed  by  Glib  [Glib.  1987).  JPL's  "third  hour"  step 
includes  time  for  team  members  to  discuss  problem  solutions  and 
clear-up  open  issues  raised  In  the  inspection  meeting.  Other  tailoring 
included  substantial  use  of  Software  Product  Assurance  personnel  as 
inspection  moderators,  a JPL  specific  training  program,  and  new  data 
collection  forms. 

The  analysis  of  defect  densities  from  inspections  was  performed  for 
the  purpose  of  1)  ensuring  that  the  conditions  of  a quality  inspection 
are  being  met  by  JPL  inspections.  2)  verifying  previous  research 
findings  on  inspections  and  3)  understanding  the  factors  which 
influence  inspection  results.  It  was  expected  that  the  results  would 
agree  with  previous  findings  on  inspections,  but  due  to  slight 
variations  in  the  variables  collected,  some  differences  were  observed. 

Methods 

Data  was  collected  on  203  Inspections  performed  on  five  software 
intensive  projects  at  JPL,  Practically  all  inspection  team  members 
were  trained  in  a one  and  a half  day  course  on  formal  inspections 
[Kelly.  1987).  Software  Product  Assurance  supplied  70%  of  the 
moderators.  The  inspections  took  place  between  February  1987  and 
April  1990.  Although  the  projects  used  Ada,  C and  Simula,  only  16% 
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were  performed  on  code.  Table  1 shews  the  types  of  Inspections 
performed  in  this  study  and  the  sample  size  for  each  type. 

The  data  included  in  this  study  was  recorded  on  the  "Formal 
Inspection  Detailed  Report"  and  "Inspection  Summary  Report"  forms 
(Appendix  1 and  2].  Each  inspection  produced  a complete  set  of 
forms  indicated  in  the  process  diagram  [Figure  1].  This  Information 
was  placed  into  a database  and  monitored.  Occasionally,  the  chief 
moderator  would  contact  the  inspection  moderator  when  reports 
were  abnormal.  This  was  done  to  provide  feedback  for  inspections 
which  were  experiencing  difficulties.  Eleven  inspection  reports  were 
rejected  for  analysis  in  this  sample  for  the  reason  that  they  violated 
some  of  the  fundamental  rules  of  inspections  as  shown  in  (Appendix 
3(. 

Checklists  were  used  to  1)  help  Inspection  team  members  focus  on 
what  to  look  for  in  a technical  work  product,  and  2)  provide  categories 
for  classifying  defects.  A generic  checklist  was  provided  in  the 
training  materials  for  each  type  of  Inspection:  Rl,  10.  II.  12.  IT1  and 
IT2.  Projects  may  use  the  generic  checklist  or  tailor  this  list  to  match 
their  own  environment  and  development  standards.  However,  we 
encouraged  projects  to  maintain  the  15  main  categories  for  types  of 
defects  shown  in  the  "Formal  Inspection  Detailed  Report".  (Appendix 
11. 

The  metrics  used  to  monitor  and  analyze  inspections  can  be  classified 
into  three  prime  areas:  staff  time,  types  of  defects  and  workproduct 
characteristics.  The  staff  time  expended  was  recorded  by  total  hours 
during  each  stage  of  the  inspection  process.  Part  way  through  our 
study  we  began  collecting  staff  time  by  the  role  played  in  the 
inspection  meeting  (author,  moderator,  or  inspector).  The 
organizational  areas,  represented  by  these  participants,  were  also 
recorded. 

Each  defect  was  classified  by  severity,  checklist  category,  and  "type". 
The  severity  of  defect  was  classified  either  major,  minor,  or  trivial. 
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Trivial  defects  in  grammar  and  spelling  were  noted  and  corrected,  but 
not  included  in  this  data  analysis  nor  on  the  "Formal  Inspection 
Detailed  Report"  {Appendix  1]. 

The  "type"  of  defect  (mission,  wrong,  or  extra)  was  recorded  on  the 
forms,  but  not  In  the  database.  This  information  is  not  as 
institutionally  critical,  however,  the  authors  find  It  to  be  a useful  guide 
during  the  rework  stage  of  the  process. 

The  workproduct  characteristics  Included  size  (by  pages  of  lines  of 
code),  phase  and  type  of  product  (requirements,  test  plan,  etc.),  and 
project  Since  inspections  were  usually  Introduced  relatively  early  In 
the  developmental  lifecycle,  when  most  products  were  technical 
documents,  the  preferred  size  reporting  metric  was  in  pages.  A 
typical  page  of  JPL  documentation  is  single  spaced.  38  lines  per  page 
in  10  point  font  size.  A page  containing  a diagram  was  counted  equal 
to  a page  of  test.  The  authors  felt  that  number  of  pages  was  a more 
accurate  measure  of  material  undergoing  Inspection  than  "estimated 
lines  of  code"  for  technical  documents,  since  projects  did  not  have  a 
history  of  a detailed  accounting  of  the  second  metrics  during  the  early 
lifecycle  phases.  Due  to  most  of  the  data  being  reported  in  pages, 
different  relationships  are  found  than  in  previous  studies.  It  should  be 
noted  that  "pages"  is  more  of  a producer  oriented  statistic  than  it  is  a 
product  oriented  measure.  One  of  the  key  metrics  that  was  used  in 
this  analysis  is  "density  of  defects  per  page".  This  metric  was  used  to 
compare  inspections  of  different  types  and  their  related  factors. 

ftCT1*** 

Results  showed  a higher  density  of  defects  in  earlier  lifecycle  products 
than  in  later  ones.  An  analysis  of  variance  was  performed  on  data 
collected  from  the  different  types  of  Inspections  in  the  sample  (Rl. 
10.  II,  12.  ITl,  and  1T2).  Figure  2 shows  the  average  number  of 
defects  found  per  page  for  each  of  the  inspection  types.  The  analysis 
of  variance  test  showed  that  at  Alpha  = 0.05.  the  defect  density  at  the 
software  requirements  inspection  (Rl)  is  significantly  higher  than  that 
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of  source  code  Inspection  (12).  and  also  the  defect  density  at  test  plan 
inspection  (IT1)  is  significantly  higher  than  that  at  test  procedures 
and  function  inspection  (IT2).  It  was  also  observed  that  the  defect 
densities  found  during  inspections  decreased  exponentially  as  the 
development  work  products  approach  the  coding  phase  (Figure  3). 

The  staff  hours  needed  to  fix  defects  were  not  found  to  be  significantly 
different  across  the  different  phases  of  the  lifecycle  (Figure  4(.  It 
should  be  noted  that  the  defects  found  and  fixed  during  these 
inspections  originated  during  the  lifecycle  phase  in  which  they  were 
detected.  Latent  defects  which  were  found  in  high  level  documents 
during  inspections  were  recorded  as  "open  issues"  and  submitted  to 
the  change  control  board.  Since  the  researchers  did  not  know  the 
timely  outcome  from  the  control  board,  these  potential  defects  are  not 
tracked  in  this  study.  However,  the  average  cost  to  fix  defects  during 
the  inspection  process  (close  to  their  origin)  was  0.5  hours,  which  is 
considerably  less  than  the  range  of  5 to  17  hours  to  fix  defects  during 
formal  testing  reported  by  a recent  JPL  project. 

Previously,  inspection  defect  counts  were  found  to  decrease  as  the 
amount  of  code  to  be  inspected  increased[Buck.  19811-  Figure  5 
shows  this  trend  to  be  sure  for  the  total  sample  of  Inspections  in  this 
study  with  respect  to  defect  density  per  page. 


The  average  inspection  team  composition  and  size  for  this  sample  are 
shown  in  Figure  6 by  type  of  inspection.  For  development  inspection 
types  (Rl,  10,  II,  and  12)  the  trend  is  for  larger  teams  for 
requirements  and  high  level  documents  while  smaller  teams  are 
needed  for  code.  The  inspection  program  at  JPL  tried  to  insure  that 
teams  were  comprised  of  members  from  organizations  having  a vested 
interest  in  the  work  product.  The  rationale  for  this  was  to  keep 
inspections  from  being  biased  toward  an  organization's  internal  view  of 
the  product. 


Figure  7 shows  the  distribution  of  defects  percentage  by  defect  types 
and  defect  categories. 
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CONCLUSIONS 


Experience  has  shown  that  formal  Inspection  of  software  is  a potent 
defect  detection  method;  and  thus,  it  enhances  overall  software  quality 
by  reducing  rework  during  testing,  as  well  as  maintenance  efforts. 

The  following  Items  highlight  the.  results  of  JPL  experience  with 
formal  Inspections. 

1.  A variety  of  different  kinds  of  defects  are  found  through 
inspections  with  Clarity.  Logic,  Completeness.  Consistency, 
and  Functionality  being  the  most  prevalent. 

2.  Increasing  the  number  of  pages  to  be  Inspected  at  a single 
inspection  decreases  the  number  of  defects  found. 

3.  Significantly  more  defects  are  found  per  page  at  the  earlier 
phases  of  the  software  lifecycle.  The  highest  defect  density 
was  observed  during  Requirements  inspections. 

4.  The  cost  in  staff  hours  to  find  and  fix  defects  was  consistently 
low  across  all  types  of  inspections.  On  average  it  took  1.1 
hours  to  find  a defect  and  0.5  hours  to  fix  and  check  it  (major 
and  minor  defects  combined). 

5.  Larger  team  sizes  (6  to  9)  for  higher  level  inspections  (R1 
and  10)  are  Justified  by  data  which  showed  an  increased 
defect  finding  capability. 

JPL  has  adopted  formal  inspections  for  many  of  its  software  intensive 
projects.  The  results  are  very  encouraging  and  show  very  significant 
improvements  in  software  quality. 
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Table  1 : Types  of  inspections  included  in  this  analysis 


Inspection  Inspection  Sample 

Abreviation  Type  Size 

R1  Software  Requirements  Inspection  23 

10  Architectural  Design  Inspection  15 

11  Detailed  Design  Inspection  92 

12  Source  Code  Inspection  34 

IT1  Test  Plan  Inspection  16 

IT2  Test  Procedures  & Functions  Inspection  22 
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Figure  1 : Overview  of  the  Software  Inspection  Process 
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Figure  2:  Defect  Density  Versus  Inspection  Types 


Development  Inspection  Types  Test  Inspection  Types 

* At  the  alphas  0.05  level  of  significance  ANOVA  F test  showed  a significant 

difference  between  the  defect  densities  of  R1  and  12,  and  between  IT1  and  IT2. 


Average  number  of 
defects  per  page 
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Figure  3.  A developed  predictive  model  for  defect 
density  as  a function  of  inspection  type 


Model:  y = 3.1 9e™" 

where  X=  1,  2, 3,  or  4 

for  R1, 10,  II  or  12  respectively 


Actual  (Avg.) 


Model 


Development  Inspection  Type 
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Staff  Hours 


Figure  4:  Staff  hours  per  defect. 


Development  Inspection  Types 


Test  Inspection  Types 


Resource  hours  lor  finding  include  all  lima  expended  during  Planning,  Overview.  Preparation,  and  Meeting  phases.  Resource  hours 
lor  fixing  Includo  all  time  expended  during  Rework.  Third  Hour,  and  Follow  up  phases.  Delects  include  all  major  and  minor  delects. 
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Figure  5.  Inspection  page  rate  versus  average  defects 

found  per  page 


0 20  40  60  80  100  120 


Number  of  pages  per  Inspection 

Note:  Inspection  "meetings"  are  limited  to  2 hours  and  moderators  are 
recommended  to  limit  material  covered  to  40  pages  or  less. 
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Figure  6:  Team  Composition  and  Size  by  Inspection  Type 


Figure  7:  Distribution  of  defects  by  classification 
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Classification 
Clarity 

Corractness/Loglc 
Completeness 
Conslatancy 
Functionality 
Compliance 
Maintenance 
Level  of  Detail 
Traceability 
RellablUty 
Performance 
Other* 

0%  5%  10%  15%  20%  25%  30%  35% 

Defect  Percentage 

* “Other"  includes  these  classifications  with  fewer  than  20  total  defects. 
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ID# 

DETAILED  INSPECTION  REPORT 

Project  
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Subsystem 

Moderator 

i /nit 

Typft  of  Ins  portion  - - - - 

PLEASE  PETUnN  COMPLETED  FOHU  TO  J.  KELLY  ATMS  IIS-333 

Defects  found  in  Inspected  Document: 

Checklist  Category 


MAJOR 


MINOR 


TOTAL 
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Inter  taco 
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liability 
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W - Wrong  E - Extra 
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JPL  INSPECTION  SUMMARY  REPORT 

Project  . Inspection  Meeting  Dale 

Subsystem Follow  Up  Completion  Dsie 

UnIKs) 

Type ol Inspection:  Quo  Q B1  □ ID  Dll  □ 12  □ IT1 

Inspection  Meeting:  Q First  Meeting  Q Re-inspection  Meeting  » Panidpams M 

8te>  Distribuion  : Now % Modified Ron 

Total  Tima  axpandad  in  Parson  Hours  (X.X) 


.%  Reused. 


□ IT2  □ Other. 
kW# fng  Clock  Hours  (X  X) 
% Dtltttd  __ 


naming  I Ovarvw  Prvpamionl  WU«tng  I fUworfc  I Foio^Up  | TtWdHour  Told 


kupociCTi 


Modwaio* 


Chock  Att  Attending  Inspection  Utadng: 
Q Project  EngioMdng  Q Titling 

Q SyaUmt  Engtotwing  Q Pioduci  Atturtnoa 

Q H/W  DovabpaiMfll  Q Opomaons 

Q S/W  Otvttoptmtnl  □ Olhtr 


Delects  found:  Major 

Open  Items  (number):  closed 
Comments: 


Defects  reworked:  Major 


Distribution: 

Manager 

Moderator 

Inspectors 


Data  Mgr. 


Ib.ee— Uvea 
lu*e—  Mm  m— V— 


Discrepancy  Reports/Change  Request(s)AVaivers: 


STATU J:  Q PASS  □ RE-INSPECTION  REQUIRED 


(lb  — i— aa»wbM>« 


Moderator's  Sicnature 


Appendix  2:  Inspection  Summary  Report 
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Appendix  3 

The  10  basic  Rules  of  Inspections: 


1 .  Inspections  are  carried  out  at  a number  of 
points  inside  designated  phases  of  the 
software  lifecycle.  Inspections  are  not 
substitutes  for  major  milestone  reviews 


2.  Inspections  are  carried  out  by  peers 
representing  the  areas  of  the  life  cycle  affected 
by  the  material  being  inspected  (usually  limited 
to  6 or  less  people).  Everyone  participating 
should  have  a vested  interest  in  the  work 
product. 

3.  Management  is  not  present  during  inspections. 
Inspections  are  not  to  be  used  as  a tool  to 
evaluate  workers. 

4.  Inspections  are  led  by  a trained  moderator. 

5.  Trained  inspectors  are  assigned  specific  roles. 
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Appendix  3 

The  10  basic  Rules  of  Inspections: 

(Continued) 


6.  inspections  are  carried  out  in  a prescribed 
series  of  steps  (as  shown  in  figure  1). 

7.  Inspection  meetings  are  limited  to  two  hours. 

8.  Checklists  of  questions  are  used  to  define  the 
task  and  to  stimulate  defect  finding. 

9.  Material  is  covered  during  the  Inspection 
meeting  within  an  optimal  page  rate  range 
which  has  been  found  to  give  maximum  error 
finding  ability. 

10.  Statistics  on  the  number  of  defects,  the  types 
of  defects  and  the  time  expended  by  engineers 
on  the  inspections  are  kept. 
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An  Analysis  of  Defect  Densities 
Found  During  Software  Inspections 

Fifteenth  Annual  Software  Engineering  Workshop 
November  28-29, 1990 
Goddard  Space  Flight  Center, 

Greenbelt,  Maryland 
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What  are  Software  Inspections? 

Software  Inspections  are: 

• Detailed  Technical  Reviews 

• Performed  on  intermediate  engineering  products 

• A highly  structured  and  well  defined  process 

• Carried  out  by  a small  group  of  peers  from  organizations  having  a 
vested  interest  in  Uie  work  product 

• Controlled  and  monitored  through  metrics  and  checklist 
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Types  of  Software  Inspections  Included  in  this  Analysis 

SAMPLE  SIZE 


• R1  Software  Requirements  Inspection  23 

• 10  Architectural  Design  Inspection  15 

• II  Detailed  Design  Inspection  92 

• 12  Source  Code  Inspection  34 

• fTI  Test  Plan  Inspection  15 

• IT2  Test  Procedures  & Functions  Inspection  23 

TOTAL  203 
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JPL  Tailoring 

of  Existing  Inspection  Techniques 

• Participants  and  Team  Composition 

• Third  Hour 

• Training 

• Support  Documentation 


JPL 

Software  Inspection  Data  Summary' 
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1:  All  times  are  averages  from  a sample  of  203  JPL  Inspections. 

2:  Guidelines  were  set  In  2/88  based  on  outside  organizations*  experience  and  a team  of  five 
7 '£  Inspectors. 

* | £ 3:  A major  defect  Is  an  error  that  would  cause  the  system  to  fall  during  operations,  or  prevent  the  system 

“ ^ from  fulfilling  a requirement  Minor  defects  are  all  other  defects  which  are  non-trivlal.  Trivial  defects  In 

£ ‘ grammar  and  spelling  were  noted  and  corrected,  but  not  Included  In  this  data  analysis. 
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Distribution  of  Defects  By  Classification 

n=  203  inspections 


Clarity 

Correctness/Logic 

Completeness 

Consistency 

Functionality 

Compliance 

Maintenance 

Level  of  Detail 

Traceability 

Reliability 

Performance 

Other* 


* “Other*  Includes  those  classifications  with  fewer  than  20  total  defects. 
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Inspection  Page  Rate  vs.  Defect  Density 


Defects 

Found 

per 

Page 


Pages  per  Inspection 


* *- 


Note:  Inspection  "meetings*  are  limited  to  2 hours  and  moderators 
are  recommended  to  limit  material  covered  to  40  pages  or  less. 
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Defect  Density  vs.  Inspection  Type 


R1*  10  II  12* 

Development  Inspection  Types 
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1.5 
1.0 
0.5 
0.0 

IT1*  IT2* 


Test  Inspection  Types 


* At  the  atpha=0.05  level  of  significance  ANOVA  F test  showed  a significant 
difference  between  the  defect  densities  of  R1  and  12,  and  between  IT1  and  IT2. 
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Predictive  Model  for  Defect  Density  vs. 

Inspection  Type 


Model : y = 3. 19  exp(-0.61x) 

where  xs  1,2,3,  or  4 

for  R1, 10,  II,  or  12  respectively  . 
' . ' / 
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Resource  Hours  per  Defect 


Development  Inspection  Types  Test  Inspection  Types 


In  contrast,  recent  JPL  projects  reported  spending  an  average  of  5 to  17 
staff  hours  to  tlx  each  defect  during  the  test  phases. 


m 

Kg'*  Resource  hours  lor  finding  include  aH  time  expended  during  Planning.  Overview,  Preparation,  and  Meeting  phases.  Resource  hours  for 
5 r fajng  indude  al  time  expended  during  Rework,  Third  Hour,  and  FoSow  up  phases.  Oeteots  indude  al  major  and  minor  detects. 
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Team  Composition  and  Size  by 
Inspection  Type 


R1  10  II  12 

Development  Inspection  Types 


|System  Eng 
gjs/WEngr 
fj^Test  Eng 
^Product  Assurance 

■Total 


IT1  fT2 


Test  Inspection  Types 
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Team  Size  vs.  Defect  Density 
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Code  Inspections  vs.  Code  Audits 


Code  Inspections 
Code  Audits 


Note:  1.  The  work  product  history  lor  code  Inspection  sample  was:  41%  new,  65%  reused,  and  4%  modified. 
The  work  product  history  for  code  audits  sample  was:  100%  new. 

2.  For  all  types  ol  inspections  combined  the  average  number  of  delects  found  per  page  was  much  higher 
than  what  was  found  In  code  inspections  (refer  to  slide  # 8).  The  overall  averages  were; 

Major  ■ 0.1 19  and  Minor  ■ 0.377,  for  a sample  size  of  203  Inspections. 


Avg.  Number  of  Defects 
Found  per  Page 

Major  Minor  Sample  Size 
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CONCLUSIONS 


• A variety  of  different  kinds  of  defects  are  found  through 
inspections  with  Clarity,  Logic,  Completeness,  Consistency,  and 
Functionality  being  the  most  prevalent. 

• Increasing  the  number  of  pages  to  be  inspected  at  a single 
inspection  decreases  the  number  of  defects  found. 

• Significantly  more  defects  were  found  per  page  at  the  earlier 
phases  of  the  software  lifecycle.  The  highest  defect  density  was 
observed  during  Requirements  inspections . 

• The  cost  in  staff  hours  to  find  and  fix  defects  was  consistently 
low  across  all  types  of  inspections.  On  average  it  took  1 .1  hours 
to  find  a defect  and  0.5  hours  to  fix  and  check  it  (major  & minor 
defects  combined). 

• Larger  team  sizes  (6  to  8 engineers)  for  higher  level 
Inspections  (R1  & 10)  are  justified  by  data  which  showed  an 
increased  defect  finding  capability. 

• Code  Inspections  were  superior  in  finding  defects  over  Code 
Audits  (single  reviewer)  by  a factor  of  3. 
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THE  SOFTWARE  METRICS  PROGRAM  IN 

MOTOROLA 

- Corporate-wide  activities  since  1988  (parts  of  the  com- 
pany have  started  earlier  than  that) 

- Emphasis  on  building  an  organizational  infrastruc- 
ture: 

Metrics  Working  Group  (MWG) 

Metrics  Users  Group  (MUG) 

Metrics  champions  within  Corporate  R&D  and 
within  business  units  (metrics  entities) 

Two-day  training  workshop  on  software  metrics 

Consulting  on  the  use  of  software  metrics  for 
process  improvement 

- A set  of  software  metrics  has  been  defined  and  it  is  re- 
quired by  the  Quality  Policy  for  Software  Development 

- Additional  metrics  are  used  by  projects  as  necessary 

- The  goal  is  not  measurement.  The  goal  is  improve- 
ment through  measurement,  analysis,  and  feedback 

MOTOROLA,  INC.  u p^iMataukb 
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MOTOROLA,  INC.  - FIRST  MALCOLM  BALDRICE  NATIONAL  QUALITY  AWARD 


OBSTACLES  WE  HAD  TO  OVERCOME 

- Setting  up  a “system”  to  capture  the  data;  initial  lack  of 
tools  that  automate  metrics;  some  projects  wanted  to 
use  metrics,  but  did  not  have  the  tools 

Getting  people  to  track  the  data;  fear  that  other  proj- 
ects do  not  report  the  data  consistently 

- Middle-level  management  resistance  in  implementing 
the  metrics;  fear  of  overhead  and  extra  cost;  percep- 
tion that  the  data  may  be  used  against  them 

- Required  a cultural  change  in  the  software  community: 
it  met  resistance,  although  Motorola  has  already  im- 
plemented measurement  systems  in  other  areas,  and 
change  is  part  of  the  everyday  business 
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COST  OF  THE  METRICS  PROGRAM 


- Cost  is  involved  in  terms  of  establishing  the  program, 
as  well  as  in  terms  of  operational  costs 

- The  MWG  meets  twice  a quarter  (about  8 people  pres- 
ent) 

- The  MUG  meets  quarterly  (about  15  people  present) 

- Tool  development  cost 

- Example  in  a Division:  3 person-years  per  year  for  ap- 
proximately 350  software  engineers  (less  than  1%  of 
resources) 

- Example  in  a Division:  0.75  person  years  per  year  for  ap- 
proximately 70  software  engineers  (about  1%  of  re- 
sources) 

- Post-release  metric  costs  have  been  insignificant 
compared  to  benefits;  in-process  metrics  are  very  use- 
ful, but  they  have  been  more  costly  and  need  to  be  auto- 
mated 

- In  general,  the  overall  cost  is  acceptable  and  justified 
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BENEFITS  SO  FAR 


- People  started  thinking  about  software  process  and 
quality;  the  data  has  helped  understand  several  prob- 
lems and  show  how  bad  these  problems  were 

- Metrics  helped  establish  baselines,  and  focus  on  ac- 
tions with  quantitative  results;  there  are  cases  of  sig- 
nificant quality  and  productivity  improvements  (exam- 
ple within  a Division:  50X  quality  improvement  in  the 
last  3.5  years) 

Presenting  the  metrics  charts  did  not  improve  the 
quality  by  itself;  it  is  the  quality  initiatives  ^aken  as  a 
result  of  looking  at  the  charts  that  made  the  differ- 
ence 

Many  indirect  benefits  (e.g.,  helped  improve  ship-ac- 
ceptance criteria,  helped  improve  schedule  estimation 
accuracy,  etc.) 
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LONG  RANGE  BENEFITS  EXPECTED 

- Software  groups  learn  from  mistakes  of  previous  proj- 
ects and  take  action  to  avoid  them 

- Significant  improvement  in  customer  satisfaction  due 
to  improved  quality 

- Significant  cost  reduction  due  to  improved  quality  (re- 
duced cost  of  rework;  resources  are  freed  up  to  work  on 
new  software  development) 

- Productivity  improvement  is  expected  to  reduce  cycle 
time,  allowing  the  products  to  reach  the  market  at  the 
right  time 

- Remember:  metrics  can  only  show  problems  and  give 
ideas  as  to  what  can  be  done;  it  is  the  actions  taken  that 
bring  the  benefits 
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Experiences  in  Establishing 
and  Maintaining 

an  Effective  Measurement  Program 

■ What  does  our  measurement  program  look  like? 

■ What  obstacles  did  we  have  to  overcome? 

■ What  are  the  costs  of  such  a program? 

■ What  are  the  benefits  so  far? 

■ What  long-range  benefits  are  expected? 
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HP's  STANDARD  SET  OF  METRICS 

Consists  of  Code  Volume,  Effort,  and 
Defect  Definitions: 

• NonComment  Source  Statement  (NCSS)  - non-comment 

physical  lines  of  code  including  compiler  directives, 
data  declarations,  print  statements  and  executable 
code.  Each  include  file  counted  only  once,  (pp58) 

• Engineering  Month  (EM)  - sum  of  calendar  payroll  months, 

attributed  to  each  project  engineer,  including  testing, 
adjusted  to  exclude  extended  vacations  and  leaves; 
excludes  time  project  managers  spend  on 
management  ta§ks.  tpP54l 

• Defect  - any  deviation  from  the  specification  and  any 

| errors  in  the  specification.  lpp-56l 


CSG  QUALITY/TXS 

PI  I Ml  II  H/I/'IO 


©19'J0  Hewlett-Packard  Co. 


HEWLETT 

PACKARD 


4- 


01J»C4«J 


-*  . 


1 


Postrelease  Discovered  Defect  Density 

6-month  Upper 

Group  A Group  B <12  Months  Average  Control 

^Detects  per  KNCSS  (normalized)  
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Month  of  Producl's  One~Year  Evaluation 
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Open  Serious  & Critical  KPR's 

10X  Goal  Actual 
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What  Obstacles  Did  We  Have  to  Overcome? 

► Perceptions  of  metrics. 

► Selling. 

- Top  management 

- Project  managers 

- Engineers 

► Too-rapid  change. 

- Leap  before  you  look 

- More  is  better 

- Desperation  for  a breakthrough 


► Organizational  changes. 
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What  Did  We  Do  Right? 

► Council. 

► Started  small. 

► Created  an  environment  for  reinforcing  success. 

- top  management  training 

- Software  Engineering  Productivity  Conferences 

- productivity  managers 

► Metrics  class. 

► Good  tool  support. 
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Hierarchy  of  Metrics  Acceptance  and  Practice 


Data  collection 
automated;  analysis 
rith  expert  system  support 
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Experiments  validating  best 
practices  with  data 


Common  terminology;  data  comparisons 
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Top  Eight  Causes  of  Defects  for  One  Division 


Specifications/ 

Requirements 


Design 


Code 


Standards  6.7% 


Environmental 

Support 


Logic  Imp!  19 


SW  10.3% 


Integration  9.7% 


Error  Chk  19.3% 


888882 


User  Interface  9.3% 


Data  Def  7.7% 


User  Interface  18.0% 


Software  Development  Technology  Lab 

de!5-U/26/90 


© 1UUO  Hcwtetl'  PacKard  Co. 


ra 


HEWLETT 

PACKARD 


t 


£ Manufacturing  Productivity  Division 

effhand-8/11/90 


© mO  I tewleU-PacKafd  Co. 


m 


HEWLETT 

PACKARD 


SPECTRUM  TECHNICAL  PROGRAM 

Post-release  Incoming  SRs  by  Customers 

NOT  CERTIFIED/  WORST  CERT  CERTIFIED 

CERTIFICATION  PR0DUCT  PR0DUCTS 


SRs  submitted  (normalized  by  KNCSS) 
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EXPERIENCES  IN  ESTABLISHING  AND 
MAINTAINING  AN  EFFECTIVE 
MEASUREMENT  PROGRAM 


Based  on  Managing  Programming 
Measurement,  ITT  Programming,  Advanced  Technology  Center 
from  1981  to  1986  at  Stratford,  Connecticut 


Ray  Wolverton 
Hughes  Aircraft  Co. 
Los  Angeles 
(213)  414-5515 


A Presentation  to  the  15th  Annual  Software  Engineering  SEL  Workshop, 

NASA/Goddard 
November  28-29, 1990 
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Programming  Measurement 


Programming  Measurement  Strategy 

• Develop  yearly  baselines  for  progress  comparison 

• Develop  project  performance  reporting  system  for 
management 

• Develop  forecasting  and  diagnostic  measurement  procedures 
for  use  by  project  personnel 

• Establish  and  enhance  integrated  measurement  system 
through  yearly  growth  in  the  type  and  accuracy  of 
measurements  reported 
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CONTEXT  AND  CIRCUMSTANCES 


# Took  three  years  to  collect  data  and  show  trends 

— 20  companies 

— 106  projects 

— 7.2  million  developed  statements 

— 12.4  million  delivered  statements 

— 3,073  personyears 

• Productivity,  calculated  as  developed  statements/personyear, 
measures  the  efficiency  of  developing  new  and  modified  code 

O Project  productivity  is  averaged  by  application  type  and  year  completed 

• Telecommunication  projects  represent  45%  of  the  projects  and  71  % of 
the  effort  in  our  last  baseline 

# Quality  trends,  quality  Improvements,  and  quality-productivity  relation- 
ship was  examined  indopth 
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Programming  Measurement  Objectives 

• Develop  project  performance  measures  relating  to  economic 
productivity,  costs  and  profits 

• Reduce  programming  production  costs  thru  increased 

j ( efficiency 

• Provide  an  early  warning  of  project  productivity,  quality, 
schedule  or  cost  difficulties 

• Improve  the  development  and  defense  of  competitive  bids 

• Track  programming  productivity  and  quality  trends 

• Compare  ITT  to  industry  performance 

• Measure  the  effectiveness  of  alternative  programming 
methods/tools 

• Understand  the  effect  of  management  decisions  on  the 
programming  process 

• Track  the  progress  of  programming  projects 

• Analyze  the  impact  and  effectiveness  of  tools,  methodology 

! | and  technology 

s • Improve  the  allocation  of  resources 
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Measurement  Programs 

• Baselines 

• Performance  factors 

• Leading  indicators 

• Resource  estimating 

• Quality  profile 

• Programmer  manager  development 

• Programmer  competency  and  task  analysis 

• Product  cost  system 

• Programming  cost  estimating 


Programming  Productivity,  Quality 
and  Cost 

Major  Factors 

• Modern  programming  practices 

• Programming  personnel 

• Organizational  structure 

• Tools  available 

• Restarts  and  direction  changes 

• Attrition 

• Number  of  locations 

• Project  complexity 

• Computer  availability 

• Project  objectives  & requirements 

• Physical  environment 

• Initial  plans,  schedules  & cost  estimates 
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PROGRAMMING  MEASUREMENT  TEAM 

I 

LIT 


Bert  Albert 
Bill  Curtis 
Sue  Hoben 
Yuan  Liu 
Hank  Malec 
John  Vosburgh 
Ray  Wolverton 


Data  Coordinator 
Human  Factors 
MIS  Specialist 
Statistical  Modeling 
Quality  Issues 
Statistical  Analysis 
Management 


CONTRACT 


Tom  Jopling 
Bruce  Roberts 
Lynn  Truss 
Barbara  Conway 


FOCUS  Programmer 
Yale  Grad  Student 
Yale  Grad  Student 
Graphics 


R.  Wafrertaa 


I 


Overall  Approach  to  the  Study  of  Productivity 
Performance  Factors 


MEASUREMENT 

QUESTIONNAIRE 


• SINGLE-ORDER  • REGRESSION 

CORRELATIONS 

• MEANS.  OTHER  ..;„i,.lLATION 

DESCRIPTIVE 

STATISTICS 

• PLOTS 


Plotting  of  Personyears  versus  Developed  Statements 


DEVELOPED  STATEMENTS 


Measurement  Strategy 

Responsibilities 

Programming  Unit 


Establish 
methodology 
and  tools | 

Establish  unrt 
responsibility 
and  collect  data 


^nalyzes  Data 


Goal 

setting 


Improvement 
tracking 

Enhanced 
methodology 
and  tools 


i Assist  ! 

j unit  i 

i goal  setting j 
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Strongly  Correlated  Groups  of  Productivity 
Factors 
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Breakdown  in  the  Variation  of  Productivity 
of  Non-MIS  Programming  Projects  in  ITT 


Controllable  factor: 
Development-Computer 
(cross  development  only) 


Controllable  factors: 
Client-Experience 
Client-Participation 


Controllable  factors: 
MPP-Usage 
Staffing-Level 
Personnel-Experience 


Controllable  factors: 
Requirements-Specification 
Requirements-Rewrite 
Concurrent-Hardware 


Uncontrollable  factors: 
Target-Computer 
Timing 

Memory-Utilization 

Application-Complexity 


Other  Variables: 

Incorrect  data 

Newness  of  application  or  design 
Documentation  requirements 

...  . . ..  . Factors  unique  to  programming  environment 

• 13  Productivity  factors  explain  % of  Other  unidentified  variables 

variation  in  productivity 


• When  combined  with  developed 
statements,  these  13  productivity  factors 
explain  90  percent  of  the  variation  in 
effort 
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Relationship  Between  Productivity  and  Quality 


TESTING 

DEFECTS 

PER 

1000 

DEVELOPED 

STATEMENTS 
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PRODUCTIVITY 


Scatterplot  of  Cost  Per  Statement  Against  Productivity 


DOLLARS 

PER 

STATEMENT 


* 7 MIS  projects  with  productivity  above  6000  statements  per  personyear 
are  out  of  range. 
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Relationship  Between  Dollars  Per  Statement 
and  Productivity  for  Various  Levels  of 
Burdened  Personyears 


PRODUCTIVITY 


Distribution  of  Productivity  with  Respect  to 

Staffing-Level 
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Programming  Meaauramant 


Programming  Measurements 


10  Leading 

• Code  production 

• Defect  removal 

• Test  achievement 

• Defect  rates 

• Test  effectiveness  index 


Indicators 

• Schedule  index 

• Problem  index 

• Development  hours  index 

• Development  cost  index 

• Work  performed  index 
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CODE  PRODUCTION  VERSUS  DATE  - INDICATOR  II  RUN  DATE*  02/Zft/A) 
COMPANY : SEL  PROJECT*  12*0 


S STARTING  LIT  (KLOC) 

C COMPLETING  MAT  (KLOC  I 
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OCTOBER 


NOVEMBER  - 
FEBRUARY  1983 


THE  LAG  TIME  IS 
BUILDING.  THE 
MODULE  MANAGER 
SHOULD  TAKE 
ACTION. 

LAG  TIME  IS 
DOUBLE  THE  20 
WEEKS  IN  THE 
PLAN.  TAKE 
ACTION  NOW 

GOOD  WORK. 
ACTION  TAKEN 
HAS  GOOD  RESULT 
STAY  ON  TOP  OF 
THE  SITUATION. 
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Programming  Productivity  Target 
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Programming  Quality  Target 
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Programming  Measurement 


Unit  Reaction  to  Program  Introduction 

“Need  It” 

"Want  It” 

BUT 

“In  The  Middle  of  Project” 

“No  Resources” 

“Takes  Time  to  Collect  Data” 

Unit  Reaction  to  Initial  Testing  & Installation 

“Can't  Get  All  The  Data” 

“Can’t  Show  Data  - No  Actions  in  Place” 

“Data  Not  Accurate" 

“Internal  Use  Only” 
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1.  Introduction 


Japanese  believe: 

"What  other  persons  do  are  right  things  to  do." 
In  this  context,  ''other  persons"  could  be: 

• Other  organizations  in  a division 

• Other  divisions  in  a company 

• Other  companies  in  the  industry 

• Other  industries  in  Japan 

• US 
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2.  What  do  Japanese  measurement  programs  look  like? 


The  most  common  and  basic  measures  are: 


• Size:  KLOC 

non-commentary  source  lines  of  code  including  reused  source 
code 

• Productivity:  LOC  per  Programmer  Month 

indirect  activities  (e.g.,  tool  development)  not  included 

• Quality:  Errors  per  KLOC 

errors  reported  during  the  development  phases  and  12  month 
after  release 


Though  there  are  differences  in  details,  the  measures  are 
conceptually  same  as  those  in  US. 
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3.  What  obstacles  did  Japanese  have  to  overcome? 

We  had  to  agree  upon  what  should  be  measured: 

• what  kind  of  data  should  be  collected? 

• how  should  data  be  analyzed? 

• how  should  results  be  fed  back? 

The  answers  are  different  ->  no  standard  measures  exist. 
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4.  Who  defines  a measurement  system  for  an  organization? 


There  are  two  cases: 


• Central  software  technology  support  group 
e.g.,  NEC,  Toshiba,  Mitsubishi,  NTT 


• Quality  assurance  or  equivalent  organizations 
e.g.,  Fujitsu,  Hitachi,  Oki,  IBM  Japan 


Based  on: 

• De  facto  standard  measures  (e.g.,  KLOC) 


• Working  papers  by  various  committees  (e.g.,  JSA) 


M.  Ohfw 
IBM/Japan 
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5.  What  is  the  cost  of  such  programs? 

Needs  a centralized  organization  which  is  responsible  for 

• defining  measures  and  evaluation  systems 

• defining  data  to  be  collected 

• defining  "simple"  methods  for  analyzing  data 

• developing  tools  for  collecting  and  analyzing  data 

• maintaining  the  database 

• providing  education 
It  is  expensive. 
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6.  What  are  the  benefits  so  far? 


'Takes  at  least  three  years  to  see  the  changes/ 


• Management  by  quantitative  objectives 

by  setting  objectives  and  reviewing  achievements 
(e.g.,  software  reliability  growth  estimation) 


• Standard  and  consistent  control  of  software  process 
by  defining  the  upper  and  the  lower  control  limits 
(e.g.,  quality  probe  by  Hitachi) 


• Incremental  and  continuous  process  improvement 
by  setting  an  annual  goal  for  an  organization 
(e.g.,  "Ayumi  (growth)"  program  by  Fujitsu) 


As  a result,  the  defect  rate  has  gone  down  from  5 Errors/KLOC  to 
0.1  Errors/KLOC  during  the  last  ten  years. 
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7.  What  long  range  benefits  are  expected? 


"Optimization"  of  software  process: 

• Design  the  best  process  for  a project 
based  on  past  experiences 


• Monitor  and  manage  the  process  properly 
based  on  data  analysis 


• Reconfigure  the  process  dynamically  if  needed 
based  on  data  analysis  results  and  experiences 


This  is  the  ideal  "Software  Factory"  and  what  engineers  in  other 
Japanese  industries  have  done  last  25  years. 


MOfaba 
IBM/ Japan 
P*t*Saf» 
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-f  Carnegie  Mellon  University 
s Pittsburgh,  PA  15213 

Sponsored  by  the  U.S.  Department  of  Defense 
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Car  nagia  MatJon  Univarsity 

Software  Engineering  Institute 


Disappointments 


Ada  acceptance  - military,  industry,  education 
Acceptance  of  software  engineering  in  education 
Standards  for  integration  of  CASE  tools 
Code  reuse  - again 
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Carnegia  Mellon  University 

Software  Engineering  Institute 


Achievements 


* 


Ada  standardization  and  evolution  of  supporting 
technology 

Focus  on  process  supported  by  measurement  and 
education 

Emergence  of  software  engineering  in  education 
Realization  of  importance  of  software  architecture 
Emergence  of  software  engineering  environments 

!?r 

f Object-oriented  design  notions 
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Carnegie  Mellon  University 

Software  Engineering  Institute 


Next  Five  Years 


Software  architecture 

Maturation  of  object-oriented  design 

Data  collection  - consistent  measures 
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Imperial  College 


SOFTWARE  ENGINEERING  IN  THE  1980’s 
Most  Significant  Achievements  / Greatest  Disappointments 

(Panel) 

15TH  ANNUAL  SOFTWARE  ENGINEERING  WORKSHOP 
GODDARD  SPACE  CENTER 
28  - 29  November  1990 


M M Lehman 
Department  Of  Computing 
Imperial  College  Of  Science  Technology  & Medicine 
& 

Lehman  Software  Technology  Associates  Ltd 
London  SW7  2BZ 
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MOST  SIGNIFICANT  ACHIEVEMENT 


A 

CASE 


GREATEST  DISAPPOINTMENT 

CASE 
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MOST  SIGNIFICANT  ACHIEVEMENTS 


NOTE  THAT  TERM  IS  MOST  SIGNIFICANT  NOT  GREATEST 

• INCREASING  APPRECIATION  OF  THE  INTRINSICALLY 

EVOLUTIONARY  NATURE  OF  SOFTWARE 

it  (IN  MY  VIEW) 

• INCREASING  RECOGNITION  OF  THE  SIGNIFICANCE  & 

ROLE  IN  SOFTWARE  DEVELOPMENT  & EVOLUTION 
OF  THE  PROCESS  & OF  PROCESS  MODELS 

U (IN  MY  VIEW) 

• MUST  EVENTUALLY  LEAD  TO  WIDER  APPRECIATION 

AND  ACCEPTANCE  OF  THE  IMPORTANCE  OF  DISCIPLINE, 
METHOD,  FORMALITY  & MECHANISATION  IN  THAT 
PROCESS  AND  TO  THEIR  GENERAL  APPLICATION 

ii  (IN  MY  VIEW) 

• THE  DEVELOPMENT  OF  SIGNIFICANT  CASE  TOOLS 

& OF  THE  CONCEPT  OF  INTEGRATED  SUPPORT 
ENVIRONMENTS 


THE  NEED  FOR  CASE  DEVELOPMENT  IS  A LOGICAL  CONSEQUENCE 
OF  THE  FIRST  LISTED  ADVANCES  BUT,  IN  PRACTICE,  HAS  PRIMARILY 
ARISEN  INDEPENDENTLY  FROM  THE  SEARCH  FOR  PRODUCTIVITY 
GROWTH  IN  SOFTWARE  DEVELOPMENT 


O Lehman  Software  Technology  Associates  Ltd  , Nov.  28,  1990 
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GREATEST  DISAPPOINTMENT 


• THE  FAILURE,  BY  & LARGE,  OF  CASE  TO  DELIVER 

EVIDENT  PRODUCTIVITY  GROWTH 

'll  (IN  MY  VIEW) 

• THE  LACK  OF  LARGE  SCALE  PENETRATION  OF 

CASE  WIDELY  OR  DEEPLY  INTO  INDUSTRIAL  & 
COMMERCIAL  SOFTWARE  DEVELOPMENT 


• THE  VERY  SLOW  DEVELOPMENT  OF  SATISFACTORY, 
COMPREHENSIVE,  TRANSFERABLE  & USABLE 
SUPPORT  ENVIRONMENTS 


• FAILURE  TO  ACHIEVE  WIDE  INDUSTRY  APPRECIATION 
OF  THE  TRUE  MEANING  OF  SOFTWARE  ENGINEERING 
& THE  ROLE  OF  SOFTWARE  ENGINEERS 


• l.climjn  Software  Technology  Associates  Ltd . Nov.  28,  1990 


M.  Lehman 


SOFTWARE  ENGINEERING  - SOFTWARE  ENGINEER 


JJ 


(IN  MY  VIEW) 


PROCESS  ENGINEERING  - PROCESS  ENGINEER 


CONTRASTS  WITH 

PROGRAMMING  AS  PRODUCT  ENGINEERING 


• SOFTWARE  ENGINEERS’  FOCUS  - PROCESSES  BY 

WHICH  SYSTEMS  ARE  DEVELOPED , PRODUCTS 
CREATED  & MAINTAINED  SATISFACTORY 

• PRIMARY  CONCERN:  DESIGN , CONTROL , SUPPORT 

OF  DEVELOPMENT  & EVOLUTION  PROCESS 

- PROCESS  ITSELF 

- METHODS 

- TOOLS 


• SELECT,  DEVELOP,  REDUCE  TO  PRACTICE 

- METHODS  & TECHNIQUES 

- PRACTICES  & PROCEDURES 

- DIRECT  TOOLS 

- GENERAL  SUPPORT 

• INTEGRATE  & INSTALL  METHODS,  TOOLS  & IPSEs 

TO  PROVIDE  SUPPORT  & INFORMATION  RETENTION 
OVER  ORGANISATION,  APPLICATION,  PROJECT 

• INVOLVEMENT  WITH  SPECIFIC  SYSTEM 

- PROJECT  & PROCESS  DESIGN 

- PLANNING 

- DEVELOPMENT  OF  PROJECT- 

SPECIFIC  =>  METHODS 
TOOLS 

- MANAGEMENT  SUPPORT 

- PROCESSES  MANAGEMENT 


0 Lehman  Softw*e  Technology  Anoriatrt  Ud , Nov.  23,  1990 
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THE  ROLE  OF  CASE 


• SUCH  OVERALL  ATTRIBUTES  CANNOT  BE  ACHIEVED 
BY  CLASSICAL  PROCESS 


• OR  B Y UNCO-ORDINATED  INTRODUCTION  OF 
INDIVIDUAL  METHODS  OR  TOOLS 


DEMANDS  CO-ORDINATED,  PROCESS  WIDE, 
INTRODUCTION  OF  - SYSTEMATIC  & DISCIPLINED  METHODS, 

- CASE  TOOLS  FOR  THEIR  SUPPORT 

- AN  INFORMATION  PRESERVING 

REPOSITORY, 

- ACTIVE  PROCESS  GUIDANCE 


• PRIMARY  GOAL  OF  CASE  CANNOT  BE  IMMEDIATE 

PRODUCTIVITY  GROWTH,  COST  REDUCTION,  OR 
VISIBLE  IMPROVEMENTS  IN  PRODUCT  QUALITY 


• IMPROVED  COST-EFFECTIVENESS  ULTIMATE  BENEFIT 


• VISIBILITY  DEPENDS  ON  MEETING  ALL  ABOVE 

REQUIREMENTS  OVER  PERIOD  OF  TIME 

• INTRODUCTION  OF  CASE  MUST  SEEK  TO  YIELD  TOTAL 

PROCESS  THAT  AS  FAR  AS  POSSIBLE,  ACHIEVES  & 
MAINTAINS,  USER  SATISFACTION  WITH  PRODUCT 

• THIS,  IN  TURN,  WILL  EVENTUALLY  PRODUCE 
PRODUCTIVITY  INCREASE  & REVENUE  GROWTH 
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SOME  OBSTACLES  - to  rapid  penetration  of  case  technology 


• LONG  LEAD  TIME  TO  VISIBLE  BENEFIT 

=»  INDIVIDUAL  & COLLECTIVE  LEARNING  & TRAINING 

=»  AND  CHANGE  OF  ATTITUDES,  HABITS  & PRACTICES 

=»  WRITE  OFF  OF  CURRENT  PRODUCER  INVESTMENT 
eg  TOOLS,  PROCEDURES,  EXPERIENCE 

=>  ACHIEVING  SIGNIFICANT  BENEFIT  DEPENDS  ON  COHERENT 

COVERAGE  OF  MAJOR  PORTION  OF  DEVELOPMENT  PROCESS 

=»  BENEFITS  OF  CASE  BECOME  SIGNIFICANT  ONLY  TOWARDS  SYSTEM 
COMPLETION  OR  AFTER  EXTENSIVE  USAGE  EXPERIENCE 

• MAJOR  INVESTMENT  BY  SOFTWARE  PRODUCERS 

=9  CAPITAL  - eg.  TOOLS 

=9  MANPOWER  - eg.  DEVELOPMENT  &TRAINING 

=9  DISRUPTION  IN  PRODUCTION 

^ RETURN  DELAYED  TILL  USERS  PERCEIVE  FISCAL  BENEFIT 

• IMPONDERABLES  IN  COST/BENEFIT  ANALYSIS 

PRODUCERS 

=9  HIDDEN  BENEFITS 

eg  CONSERVATION  OF  SKILLED  MANPOWER 

=9  ANTIREGRESSIVE  BENEFITS 

eg  POTENTIAL  REDUCTION  IN  EMERGENCY  RESPONSE  TO  USER  PROBLEMS 

=*  MARKETING  VALUE  OF  PRODUCT  QUALITY  IMPROVEMENT 
eg  SIMPLER  TO  LEARN,  MORE  RESPONSIVE  MAINTENANCE 

USERS 

- IMPROVED  QUALITY 

eg  SIMPLER  TO  LEARN,  INSTAL,  MAINTAIN,  ADAPT 

- ANTIREGRESSIVE  BENEFITS 

eg  REDUCTION  IN  USER  DOWN  TIME  & LOSSES 

- HIDDEN  BENEFITS 

eg  FASTER  RESPONSE  TO  CHANGE  AND  EVOLUTION  REQUESTS 

• INSUFFICIENT  EXPERIENCE,  DATA  OR  THEORY  FOR 

CONVINCING  DETERMINATION  OF  BENEFIT 
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Achievements  and  Disappointments 
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Significant  Achievements  Past  Decade 

Spiral  Model  of  Software  Development 
in  Place  of  Waterfall  Model 

Significant  Developments  of  Metrics 
for  Software  Technical  Management 

Establishment  of  National  Resource  in 
Software  Engineering  Institute 

Cleanroom  Engineering  of  Software 
under  Statistical  Quality  Control 
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Greatest  Disappointments  Past  Decade 

Use  of  Software  Engineering  as  a Buzzword, 
not  as  a Real  Engineering  Discipline 

Continued,  Widespread,  but  Unnecessary 

• Poor  Quality,  Unreliable  Software 

• Low  Productivity  Software  Development 

• Missed  Schedule  Software  Deliveries 
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Cleanroom  Engineering  of  Software 
under  Statistical  Quality  Control 


Statistical  Usage  Specifications  as  well  as 

Function  and  Performance  Specifications 

Software  Development  in  a Pipeline  of 
Increments  with  Separate  Certification 

Scaled  Up  Informal  Verification  of  Software 
to  Meet  Specifications 

Producing  Software  without  Private  Debugging 
before  Public  Certification  Testing 
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Significant  Achievements  Next  Decade 

Formalization  of  Spiral  Models  of  Software 

Development  for  Procurement/ Management 

Continued  Developments  of  Metrics 
for  Software  Technical  Management 

Continuation  of  National  Resource  in 
Software  Engineering  Institute 

Expanded  Cleanroom  Engineering  of  Software 
under  Statistical  Quality  Control 
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Software  Engineering  in  the  1980’s: 

The  most  Significant  Achievements  and  Greatest  Disappointments 


What  have  been  the  most  significant  achievements  for  software  engineering 
in  the  past  10  years? 

What  have  been  the  greatest  disappointments  for  software  engineering  in  the 
past  10  years? 

What  is  the  objective  or  subjective  criteria  supporting  your  assessments? 

What  software  engineering  advances  will  make  the  most  significant 
contribution  in  the  next  five  years? 


Significant  Achievements 


Maturing: 

Recognition  of  importance  of  Process,  Formal  Methods 
Recognition  of  need  for  Multiple  Life  Cycle  Models,  Methods,  etc. 

Technologies: 

Measurement 

Use  of  Data  Abstractions  and  Object  Oriented  methods 
Use  of  Ada 


Greatest  Disappointments 


That  the  maturing  has  taken  so  long 

That  some  people  are  still  looking  for  magic 

The  lack  of  wide  spread  use  of  measurement,  and  formal  methods 

The  lack  of  effective  automated  support  for  software  development 

The  lack  of  advance  in  testing  practices 


Future  Achievements 


Next  5 years: 

Focus  on  Engineering 

Wider  spread  use  of  process  improvement  through  m,  auurement 
Reuse  of  Packaged  Experience 


Next  10  years: 

Real  automated  support 

Maturing  of  personnel  with  consistent  background 
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